This script analyzes siRNA screen data obtained by high-content image analysis. The siRNA libraries used are the ThermoFisher Silencer Select hEpigenetics and hNuclear Envelope libraries, which contain 3 siRNA oligos per gene against 521 and 346 genes involved in chromatin regulation and in the nuclear envelope, respectively. The libraries were received dried in 96-well plates at 0.25 nmol per well. Each oligo was resuspended in 50 ul ddH2O (5 uM final), then frozen and thawed to increase siRNA oligo solubility. Column 12 of the 96-well plates was left empty on purpose. The resuspended libraries (18 + 12 plates in total) were then transferred and compressed into 384-well format to obtain 5 + 3 Mother plates (plate 4 and plate 5 of the hEpigenetics library are not full). For every Mother plate, 3 Daughter plates were generated at a final concentration of 400 nM. Columns 23 and 24 were originally left empty, and siRNA controls were added at a later stage. For each Daughter plate, 3 assay-ready Imaging plates were generated by spotting 2 ul of siRNA oligo at the bottom of each well. All liquid handling operations were performed on a PerkinElmer Janus instrument, which outputs the liquid handling logs as text files. Imaging plates were dried, frozen and then used in reverse transfection experiments. For reverse transfection, plates were thawed and spun, then 20 ul of Optimem containing 0.05 ul/well of RNAiMax was added and incubated for 30 min. Then 20 ul of cells were added on top of the siRNA/RNAiMax mix and incubated for 72 hrs. Fixed and stained plates were imaged on an Opera QEHS microscope using a 40X water immersion objective. Images were analyzed in Columbus 2.6, and the image analysis results were exported as tab-delimited .txt files.
The script reads the Janus log files and the Columbus image analysis results, and generates all the .txt files needed to analyze the siRNA screen results in cellHTS2. It also runs the cellHTS2 analysis and outputs an HTML report.
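The concentrations quoted in the protocol description follow from simple dilution arithmetic. The sketch below recomputes them, reading the 0.25 figure as the amount of dried oligo per well in nmol (the reading that reproduces the stated 5 uM stock). The 27.6 ul and 2.4 ul volumes are taken from the Daughter plate logs printed further down. The final in-well concentration is an assumption: it supposes that the dried siRNA spot adds no volume to the 40 ul of transfection mix and cells. Variable names are illustrative only.

```r
# Resuspension: 0.25 nmol of dried oligo in 50 ul of water
oligo_nmol <- 0.25
resuspension_ul <- 50
stock_uM <- oligo_nmol / resuspension_ul * 1000  # nmol/ul is mM, so x1000 gives uM

# Daughter plates: 2.4 ul of Mother plate stock into 27.6 ul of diluent
mother_ul <- 2.4
diluent_ul <- 27.6
daughter_nM <- stock_uM * 1000 * mother_ul / (mother_ul + diluent_ul)

# Imaging plates: 2 ul spotted and dried, then 20 ul Optimem/RNAiMax + 20 ul cells
spot_ul <- 2
final_ul <- 20 + 20
final_nM <- daughter_nM * spot_ul / final_ul  # assumes the dried spot adds no volume

# Compression: four 96-well plates fit into one 384-well plate
mother_plates <- ceiling(c(hEpigenetics = 18, hNucEnvelope = 12) / 4)

round(c(stock_uM = stock_uM, daughter_nM = daughter_nM, final_nM = final_nM), 2)
mother_plates
```

Under these assumptions the stock comes out at 5 uM, the Daughter plates at 400 nM, and the siRNA in the assay well at roughly 20 nM. The plate counts give 5 and 3 Mother plates, which agrees with the description above.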
library(plyr)
library(ggplot2)
library(stringr)
library(knitr)
library(data.table)
library(cellHTS2)
library(ggthemes)
Choose the Columbus image analysis result variable to be analyzed in cellHTS2. This might change from analysis to analysis. The Janus and Columbus files should be in dedicated directories. The chunk below assigns the Columbus variable chosen for analysis and names the cellHTS2 input and output directories accordingly.
meas_select <- quote(`Nuclei Selected - Nucleus Area [µm²] - Mean per Well`)
Janus_output_dir <- "Janus_output"
HTS2_input_dir <- paste("cellHTS2_input",
as.character(meas_select),
sep = "_")
HTS2_output_dir <- paste("cellHTS2_output",
as.character(meas_select),
sep = "_")
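The chunk above stores the measurement as a quoted symbol, so the same object can serve both as a string (for directory names) and as a column reference. A minimal sketch of that idea in base R, using a hypothetical two-well table and a shortened column name:

```r
# Hypothetical two-well table with a non-syntactic column name, as exported by Columbus
wells <- data.frame(WellName = c("A1", "A2"),
                    `Nucleus Area - Mean per Well` = c(190.6, 179.6),
                    check.names = FALSE)

# quote() stores the column name as a symbol without evaluating it
meas <- quote(`Nucleus Area - Mean per Well`)

# as.character() turns the symbol into a string, e.g. to build a directory name
input_dir <- paste("cellHTS2_input", as.character(meas), sep = "_")

# eval() looks the symbol up inside the table and returns the column values
values <- eval(meas, wells)

input_dir
values
```

Changing the measurement then only requires editing the single quote() call.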
Generate the HTS2_input_dir and HTS2_output_dir directories (here cellHTS2_input_Nuclei Selected - Nucleus Area [µm²] - Mean per Well and its cellHTS2_output counterpart) for the cellHTS2 input and output files, respectively, to be created by this script. In addition, create an empty template for the cellHTS2 description file in the input directory. This file contains information about the experiment. Fill it in with a text editor before continuing: without this file, cellHTS2 won’t run.
if(!dir.exists(HTS2_input_dir)){
dir.create(HTS2_input_dir)
}
if(!dir.exists(HTS2_output_dir)){
dir.create(HTS2_output_dir)
}
if(!file.exists(paste0(HTS2_input_dir,"/Description.txt"))) {
templateDescriptionFile(filename = "Description.txt", path = HTS2_input_dir)
}
## [1] "cellHTS2_input_Nuclei Selected - Nucleus Area [µm²] - Mean per Well/Description.txt"
Read the siRNA layout information provided by ThermoFisher and select only the relevant columns. The files should be in the directory named in Janus_output_dir (Janus_output).
dt_nuc <- fread(paste0(Janus_output_dir, "/AMO20JUZ-Sil Sel NucEnvelope siRNA Lib-96 well-Total.txt")) # Thermo gene list file for the Nuclear Envelope library
dt_ep <- fread(paste0(Janus_output_dir, "/A30085-AMO20K2X-Sil Sel Hm Epigenetics siRNA Lib-96 well-Total.txt")) # Thermo gene list file for the Epigenetics library
dt_gl <- rbindlist(list(dt_nuc, dt_ep))
setnames(dt_gl, c("Location (Row-Col)", "Plate ID"), c("Ambion_well", "Ambion_barcode"))
names_filter <- quote(list(`Lot Number`,
Ambion_barcode,
`Plate Name`,
Ambion_well,
Row,
Col,
`RefSeq Accession Number`,
`Gene Symbol`,
`Full Gene Name`,
`Gene ID`,
`siRNA ID`,
`Exon(s) Targeted`,
`Sense siRNA Sequence`,
`Antisense siRNA Sequence`, Validated))
dt_gl_filter <- dt_gl[!(`Gene ID` == ""), eval(names_filter)]
dt_gl_filter
## Lot Number
## 1: AMO20JUZ
## 2: AMO20JUZ
## 3: AMO20JUZ
## 4: AMO20JUZ
## 5: AMO20JUZ
## ---
## 2589: A30085-AMO20K2X-Sil Sel Hm Epigenetics siRNA Lib-96 well-02
## 2590: A30085-AMO20K2X-Sil Sel Hm Epigenetics siRNA Lib-96 well-02
## 2591: A30085-AMO20K2X-Sil Sel Hm Epigenetics siRNA Lib-96 well-02
## 2592: A30085-AMO20K2X-Sil Sel Hm Epigenetics siRNA Lib-96 well-02
## 2593: A30085-AMO20K2X-Sil Sel Hm Epigenetics siRNA Lib-96 well-02
## Ambion_barcode Plate Name Ambion_well Row Col
## 1: CPF203AA 0006151435-A1 A1 A 1
## 2: CPF203AA 0006151435-A1 A2 A 2
## 3: CPF203AA 0006151435-A1 A3 A 3
## 4: CPF203AA 0006151435-A1 A4 A 4
## 5: CPF203AA 0006151435-A1 A5 A 5
## ---
## 2589: CPF2039H Hm Epigenetic siRNA Lib V1-C2-2 G11 G 11
## 2590: CPF2039H Hm Epigenetic siRNA Lib V1-C2-2 H1 H 1
## 2591: CPF2039H Hm Epigenetic siRNA Lib V1-C2-2 H2 H 2
## 2592: CPF2039H Hm Epigenetic siRNA Lib V1-C2-2 H3 H 3
## 2593: CPF2039H Hm Epigenetic siRNA Lib V1-C2-2 H4 H 4
## RefSeq Accession Number Gene Symbol
## 1: NM_001014977 BANF2
## 2: NM_001032283 TMPO
## 3: NM_003860 BANF1
## 4: NM_080668 CDCA5
## 5: NM_024624 SMC6
## ---
## 2589: NM_152489 UBE2U
## 2590: NM_138287 DTX3L
## 2591: NM_201286 USP51
## 2592: NM_152617 RNF168
## 2593: NM_203494 USP50
## Full Gene Name Gene ID siRNA ID
## 1: barrier to autointegration factor 2 140836 s195742
## 2: thymopoietin 7112 s14233
## 3: barrier to autointegration factor 1 8815 s16807
## 4: cell division cycle associated 5 113130 s41424
## 5: structural maintenance of chromosomes 6 79677 s36079
## ---
## 2589: ubiquitin-conjugating enzyme E2U (putative) 148581 s45194
## 2590: deltex 3-like (Drosophila) 151636 s45593
## 2591: ubiquitin specific peptidase 51 158880 s46137
## 2592: ring finger protein 168 165918 s46601
## 2593: ubiquitin specific peptidase 50 373509 s51516
## Exon(s) Targeted Sense siRNA Sequence Antisense siRNA Sequence
## 1: Not Determined CCUGUAGACACAAACCUCAtt UGAGGUUUGUGUCUACAGGaa
## 2: 3,3,3 GAAUGGAAGUAAUGAUUCUtt AGAAUCAUUACUUCCAUUCtg
## 3: Not Determined AGUUUCUGGUGCUAAAGAAtt UUCUUUAGCACCAGAAACUgg
## 4: Not Determined GCAGGGAGCUUACUAAGGAtt UCCUUAGUAAGCUCCCUGCca
## 5: Not Determined GAGGUUUUAUAUAACCGAUtt AUCGGUUAUAUAAAACCUCag
## ---
## 2589: 8 CAUUGGACAGUAUUACAAAtt UUUGUAAUACUGUCCAAUGaa
## 2590: Not Determined GCGUAUUAGGAGUCUCAGAtt UCUGAGACUCCUAAUACGCga
## 2591: Not Determined GGAUCCAUGCAGAACAUUUtt AAAUGUUCUGCAUGGAUCCat
## 2592: 6 GAAGAUAUGCCGACACUUUtt AAAGUGUCGGCAUAUCUUCta
## 2593: Not Determined CUCACUAACUUGGACCUCAtt UGAGGUCCAAGUUAGUGAGtg
## Validated
## 1: NA
## 2: NA
## 3: NA
## 4: NA
## 5: NA
## ---
## 2589: NA
## 2590: NA
## 2591: NA
## 2592: NA
## 2593: NA
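Since the library is described as containing 3 siRNA oligos per gene, a quick sanity check on the filtered layout is to count oligos per gene. The sketch below does this in base R on a toy table; the gene symbols come from the output above, while the siRNA IDs and the empty row are made up for illustration.

```r
# Toy layout table; gene symbols are from the output above, siRNA IDs are made up
layout <- data.frame(
  `Gene Symbol` = c("BANF2", "BANF2", "BANF2", "TMPO", "TMPO", "TMPO", ""),
  `siRNA ID`    = c("s1", "s2", "s3", "s4", "s5", "s6", ""),
  check.names = FALSE, stringsAsFactors = FALSE)

# Drop rows without an annotation (empty wells), as the chunk above does with `Gene ID`
layout <- layout[layout$`Gene Symbol` != "", ]

# Count distinct oligos per gene and flag genes that do not have exactly three
oligos_per_gene <- tapply(layout$`siRNA ID`, layout$`Gene Symbol`,
                          function(ids) length(unique(ids)))
unexpected <- names(oligos_per_gene)[oligos_per_gene != 3]

oligos_per_gene
unexpected
```

On the real table, genes listed in `unexpected` would need a manual look. A gene may, for instance, be present in both libraries.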
Set the RegEx patterns for the directory searches for log files and image analysis results, one pattern per protocol step.
pat_r <- "Resuspension_Source_Plates_\\d*\\.csv$" # Pattern for Janus Resuspension log files
pat_c <- "Compression_\\d*\\.csv$" # Pattern for Janus Compression log files
pat_d <- "Daughter_Plates_\\d*\\.csv$" # Pattern for Janus Daughter Plates log files
pat_s <- "Image_Plates_\\d*\\.csv$" # Pattern for Janus Spotting log files
pat_ac <- "Add_siRNA_Controls_\\d*\\.csv$" # Pattern for Janus Add Controls log files
pat_col <- "result\\.1\\.txt" # Pattern for Columbus results files
Create a list of the RegEx patterns set in the previous chunk. Important: the names of the list elements are carried over to all the following steps.
pat_list <- list(r = pat_r, c= pat_c, d = pat_d, s = pat_s, ac = pat_ac, col = pat_col)
pat_list
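Before searching the disk, the patterns can be checked against a few known file names. The sketch below uses file names that appear in the .id columns of this document and four of the patterns defined above. The `hits` matrix is only an illustrative helper.

```r
# File names as they appear in the .id columns of this document
files <- c("Compression_00554.csv",
           "Daughter_Plates_00559.csv",
           "Image_Plates_00800.csv",
           "161212_Epi_Env_Screen_v3[103387].result.1.txt")

# Same patterns as above, restricted to the four steps represented in `files`
pats <- c(c = "Compression_\\d*\\.csv$",
          d = "Daughter_Plates_\\d*\\.csv$",
          s = "Image_Plates_\\d*\\.csv$",
          col = "result\\.1\\.txt")

# One row per file, one column per pattern; TRUE marks a match
hits <- sapply(pats, grepl, x = files)
rownames(hits) <- files
hits
```

Each file should match exactly one pattern, so the matrix has TRUE on the diagonal only.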
Recursively search the working directory and its subdirectories for files whose names match the RegEx patterns defined two chunks above. The list_files function returns file paths relative to the working directory. path_list is a list containing all the file paths on a per-step basis.
list_files <- function(x){
dir(pattern = x, full.names = TRUE, recursive = TRUE, include.dirs = TRUE)
}
path_list <- llply(pat_list, list_files)
path_list
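To see what the recursive search returns without touching real data, the sketch below builds a throwaway directory tree under tempdir() and runs the same dir() call on it. The directory layout and the notes.txt file are made up for the example.

```r
# Build a throwaway directory tree with two log files and one unrelated file
root <- tempfile("janus_demo_")
dir.create(file.path(root, "logs", "step2"), recursive = TRUE)
file.create(file.path(root, "logs", "step2", "Compression_00554.csv"),
            file.path(root, "logs", "Compression_00570.csv"),
            file.path(root, "notes.txt"))

# Search from inside the tree, as the script does from the working directory
old_wd <- setwd(root)
found <- dir(pattern = "Compression_\\d*\\.csv$", full.names = TRUE, recursive = TRUE)
setwd(old_wd)

found            # paths start with "./", i.e. they are relative to the working directory
basename(found)  # file names only
```

Because the paths are relative, the rest of the script has to run from the same working directory as the search.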
Extract the file names from the paths and set them as the names of the list elements.
trim_names <- function(x){
  names(x) <- basename(x) # Name each path after its file; these names later become the .id column
  x # Return the named vector explicitly: the value of the assignment above is the names, not x
}
path_list <- llply(path_list, trim_names)
Read and merge the data files of each step into a single data.table. Rows are labeled with the name of the file they were read from (the .id variable). This and the previous chunks are slightly modified tricks adopted from H. Wickham's “Tidy Data” paper.
read_merge <- function(x){
  # ldply() reads every file with fread() and stacks the tables, adding the element names as .id
  as.data.table(ldply(x, fread, integer64 = "character"))
}
dt_list <- llply(path_list, read_merge)
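For readers unfamiliar with plyr, the read-and-merge step can be approximated in base R. The sketch below is not the script's implementation. It writes two small CSV files with made-up contents, reads them back, and stacks them with a .id column holding the file name.

```r
# Write two small CSV files standing in for Janus logs (contents are made up)
tmp <- tempfile("merge_demo_")
dir.create(tmp)
write.csv(data.frame(Rack = "HT00003", Well = c("A1", "C1")),
          file.path(tmp, "Compression_00554.csv"), row.names = FALSE)
write.csv(data.frame(Rack = "HT00025", Well = "P24"),
          file.path(tmp, "Compression_00570.csv"), row.names = FALSE)

# Named vector of paths: the names are the file names, as set by trim_names above
paths <- list.files(tmp, full.names = TRUE)
names(paths) <- basename(paths)

# Read each file, prepend its name as .id, and stack the results
tables <- lapply(names(paths), function(id) {
  cbind(.id = id, read.csv(paths[[id]], stringsAsFactors = FALSE))
})
merged <- do.call(rbind, tables)
merged
```

Keeping the file name in every row makes it possible to trace any well back to the log it came from.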
Separate Columbus data from the other classes of data.
dt_col <- dt_list$col
dt_list <- dt_list[setdiff(names(dt_list), "col")] # Keep the five Janus steps, selected by name rather than by position
dt_col
## .id
## 1: 161212_Epi_Env_Screen_v3[103387].result.1.txt
## 2: 161212_Epi_Env_Screen_v3[103387].result.1.txt
## 3: 161212_Epi_Env_Screen_v3[103387].result.1.txt
## 4: 161212_Epi_Env_Screen_v3[103387].result.1.txt
## 5: 161212_Epi_Env_Screen_v3[103387].result.1.txt
## ---
## 6139: 161212_Epi_Env_Screen_v3[103441].result.1.txt
## 6140: 161212_Epi_Env_Screen_v3[103441].result.1.txt
## 6141: 161212_Epi_Env_Screen_v3[103441].result.1.txt
## 6142: 161212_Epi_Env_Screen_v3[103441].result.1.txt
## 6143: 161212_Epi_Env_Screen_v3[103441].result.1.txt
## ScreenName ScreenID PlateName PlateID
## 1: 161212_Epi_Env_Screen_Batch1 2146 HTIF00034 2667
## 2: 161212_Epi_Env_Screen_Batch1 2146 HTIF00034 2667
## 3: 161212_Epi_Env_Screen_Batch1 2146 HTIF00034 2667
## 4: 161212_Epi_Env_Screen_Batch1 2146 HTIF00034 2667
## 5: 161212_Epi_Env_Screen_Batch1 2146 HTIF00034 2667
## ---
## 6139: 161212_Epi_Env_Screen_Batch1 2146 HTIF00058 2669
## 6140: 161212_Epi_Env_Screen_Batch1 2146 HTIF00058 2669
## 6141: 161212_Epi_Env_Screen_Batch1 2146 HTIF00058 2669
## 6142: 161212_Epi_Env_Screen_Batch1 2146 HTIF00058 2669
## 6143: 161212_Epi_Env_Screen_Batch1 2146 HTIF00058 2669
## MeasurementDate MeasurementID WellName Row Column Timepoint
## 1: 2016-12-13T02:51:57Z 2594 A1 1 1 1
## 2: 2016-12-13T02:51:57Z 2594 A2 1 2 1
## 3: 2016-12-13T02:51:57Z 2594 A3 1 3 1
## 4: 2016-12-13T02:51:57Z 2594 A4 1 4 1
## 5: 2016-12-13T02:51:57Z 2594 A5 1 5 1
## ---
## 6139: 2016-12-13T10:38:15Z 2596 P20 16 20 1
## 6140: 2016-12-13T10:38:15Z 2596 P21 16 21 1
## 6141: 2016-12-13T10:38:15Z 2596 P22 16 22 1
## 6142: 2016-12-13T10:38:15Z 2596 P23 16 23 1
## 6143: 2016-12-13T10:38:15Z 2596 P24 16 24 1
## Plane Nuclei - Number of Objects
## 1: 1 1134
## 2: 1 1565
## 3: 1 699
## 4: 1 1211
## 5: 1 1522
## ---
## 6139: 1 1422
## 6140: 1 1505
## 6141: 1 1515
## 6142: 1 31
## 6143: 1 415
## Nuclei - Nuclei Selected - Mean per Well
## 1: 0.7813051
## 2: 0.8089457
## 3: 0.7839771
## 4: 0.8191577
## 5: 0.8134034
## ---
## 6139: 0.8164557
## 6140: 0.8053156
## 6141: 0.7874587
## 6142: 0.6451613
## 6143: 0.7204819
## Nuclei Selected - Number of Objects
## 1: 886
## 2: 1266
## 3: 548
## 4: 992
## 5: 1238
## ---
## 6139: 1161
## 6140: 1212
## 6141: 1193
## 6142: 20
## 6143: 299
## Nuclei Selected - Nucleus Area [µm²] - Mean per Well
## 1: 190.6056
## 2: 179.5904
## 3: 205.4739
## 4: 187.1641
## 5: 171.9464
## ---
## 6139: 177.5044
## 6140: 179.6823
## 6141: 183.7231
## 6142: 139.3442
## 6143: 270.1453
## Nuclei Selected - Nucleus Roundness - Mean per Well
## 1: 0.9611200
## 2: 0.9686175
## 3: 0.9648931
## 4: 0.9644785
## 5: 0.9676596
## ---
## 6139: 0.9692354
## 6140: 0.9667362
## 6141: 0.9679271
## 6142: 0.8372388
## 6143: 0.9734971
## Nuclei Selected - Nucleus Width [µm] - Mean per Well
## 1: 12.507684
## 2: 12.415063
## 3: 13.170453
## 4: 12.498291
## 5: 12.152409
## ---
## 6139: 12.269038
## 6140: 12.320288
## 6141: 12.414162
## 6142: 9.639863
## 6143: 15.238819
## Nuclei Selected - Nucleus Length [µm] - Mean per Well
## 1: 17.66686
## 2: 16.94635
## 3: 18.41283
## 4: 17.43507
## 5: 16.44493
## ---
## 6139: 16.87936
## 6140: 16.96053
## 6141: 17.21266
## 6142: 16.76403
## 6143: 21.19805
## Nuclei Selected - Nucleus Ratio Width to Length - Mean per Well
## 1: 0.7127882
## 2: 0.7367895
## 3: 0.7215694
## 4: 0.7196473
## 5: 0.7410293
## ---
## 6139: 0.7305002
## 6140: 0.7299013
## 6141: 0.7244995
## 6142: 0.5926918
## 6143: 0.7228466
## Nuclei Selected - Intensity Nucleus Green Mean - Mean per Well
## 1: 330.8730
## 2: 259.4851
## 3: 365.1034
## 4: 298.1092
## 5: 262.2875
## ---
## 6139: 230.2500
## 6140: 211.9847
## 6141: 210.6042
## 6142: 488.5992
## 6143: 339.8211
## Nuclei Selected - Intensity Nucleus Red Mean - Mean per Well
## 1: 376.3647
## 2: 290.4706
## 3: 473.5955
## 4: 358.3449
## 5: 299.1064
## ---
## 6139: 285.7329
## 6140: 251.2956
## 6141: 258.6473
## 6142: 562.5378
## 6143: 547.8045
## Number of Analyzed Fields
## 1: 30
## 2: 30
## 3: 30
## 4: 30
## 5: 30
## ---
## 6139: 30
## 6140: 30
## 6141: 30
## 6142: 30
## 6143: 30
## Link
## 1: http://columbus.nci.nih.gov/browse/measurement/2594/well=1.1
## 2: http://columbus.nci.nih.gov/browse/measurement/2594/well=1.2
## 3: http://columbus.nci.nih.gov/browse/measurement/2594/well=1.3
## 4: http://columbus.nci.nih.gov/browse/measurement/2594/well=1.4
## 5: http://columbus.nci.nih.gov/browse/measurement/2594/well=1.5
## ---
## 6139: http://columbus.nci.nih.gov/browse/measurement/2596/well=16.20
## 6140: http://columbus.nci.nih.gov/browse/measurement/2596/well=16.21
## 6141: http://columbus.nci.nih.gov/browse/measurement/2596/well=16.22
## 6142: http://columbus.nci.nih.gov/browse/measurement/2596/well=16.23
## 6143: http://columbus.nci.nih.gov/browse/measurement/2596/well=16.24
Eliminate the mixing steps, i.e. the steps where the source plate barcode and the destination plate barcode are the same.
eliminate_mix <- function(dt){
dt <- dt[!(Rack == SrcRack),]
}
dt_list <- llply(dt_list, eliminate_mix)
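The effect of this filter can be seen on a toy log, shown here in base R. The barcodes are borrowed from the tables in this document, but the rows themselves are made up.

```r
# Toy Janus log: the second row is a mixing step (source and destination plate are the same)
janus_log <- data.frame(SrcRack = c("CPF20390", "HT00003", "CPF20391"),
                        SrcWell = c("A1", "A1", "A1"),
                        Rack    = c("HT00003", "HT00003", "HT00003"),
                        Well    = c("A1", "A1", "A2"),
                        stringsAsFactors = FALSE)

# Keep only the rows that describe a transfer between two different plates
transfers <- janus_log[janus_log$Rack != janus_log$SrcRack, ]
transfers
```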
Separate the remaining data tables.
dt_r <- dt_list$r
dt_c <- dt_list$c
dt_d <- dt_list$d
dt_s <- dt_list$s
dt_ac <- dt_list$ac
rm(dt_list)
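In the Janus logs, SrcRack and SrcWell identify the source plate and well, and Rack and Well identify the destination. Rename these columns according to the plates involved in each step of the Janus protocol.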
setnames(dt_c, c("SrcRack", "SrcWell", "Rack", "Well"),
c("Ambion_barcode", "Ambion_well", "Mother_barcode", "Mother_well"))
setnames(dt_d, c("SrcRack", "SrcWell", "Rack", "Well"),
c("Mother_barcode", "Mother_well", "Daughter_barcode", "Daughter_well"))
setnames(dt_s, c("SrcRack", "SrcWell", "Rack", "Well"),
c("Daughter_barcode", "Daughter_well", "Assay_barcode", "Assay_well"))
dt_c
## .id UserName TestId
## 1: Compression_00554.csv JANUS 554
## 2: Compression_00554.csv JANUS 554
## 3: Compression_00554.csv JANUS 554
## 4: Compression_00554.csv JANUS 554
## 5: Compression_00554.csv JANUS 554
## ---
## 2876: Compression_00570.csv JANUS 570
## 2877: Compression_00570.csv JANUS 570
## 2878: Compression_00570.csv JANUS 570
## 2879: Compression_00570.csv JANUS 570
## 2880: Compression_00570.csv JANUS 570
## TestName TestDateTime
## 1: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/5/2016 10:18:14 AM
## 2: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/5/2016 10:18:14 AM
## 3: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/5/2016 10:18:14 AM
## 4: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/5/2016 10:18:14 AM
## 5: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/5/2016 10:18:14 AM
## ---
## 2876: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/11/2016 3:36:13 PM
## 2877: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/11/2016 3:36:13 PM
## 2878: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/11/2016 3:36:13 PM
## 2879: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/11/2016 3:36:13 PM
## 2880: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/11/2016 3:36:13 PM
## OperationId OpTime ProcName Tip SampleId Mother_barcode
## 1: 961 215 Compress Plate MDT_1 1 SRC0001 HT00003
## 2: 962 215 Compress Plate MDT_1 2 SRC0002 HT00003
## 3: 963 215 Compress Plate MDT_1 3 SRC0003 HT00003
## 4: 964 215 Compress Plate MDT_1 4 SRC0004 HT00003
## 5: 965 215 Compress Plate MDT_1 5 SRC0005 HT00003
## ---
## 2876: 4508 578 Compress Plate MDT_1 92 SRC0380 HT00025
## 2877: 4509 578 Compress Plate MDT_1 93 SRC0381 HT00025
## 2878: 4510 578 Compress Plate MDT_1 94 SRC0382 HT00025
## 2879: 4511 578 Compress Plate MDT_1 95 SRC0383 HT00025
## 2880: 4512 578 Compress Plate MDT_1 96 SRC0384 HT00025
## Mother_well Ambion_barcode Ambion_well Volume ErrorText
## 1: A1 Q1_001CPF20390 A1 45 NA
## 2: C1 Q1_001CPF20390 B1 45 NA
## 3: E1 Q1_001CPF20390 C1 45 NA
## 4: G1 Q1_001CPF20390 D1 45 NA
## 5: I1 Q1_001CPF20390 E1 45 NA
## ---
## 2876: H24 CPF203AL D12 45 NA
## 2877: J24 CPF203AL E12 45 NA
## 2878: L24 CPF203AL F12 45 NA
## 2879: N24 CPF203AL G12 45 NA
## 2880: P24 CPF203AL H12 45 NA
## Response Text
## 1: NA
## 2: NA
## 3: NA
## 4: NA
## 5: NA
## ---
## 2876: NA
## 2877: NA
## 2878: NA
## 2879: NA
## 2880: NA
dt_d
## .id UserName TestId
## 1: Daughter_Plates_00559.csv JANUS 559
## 2: Daughter_Plates_00559.csv JANUS 559
## 3: Daughter_Plates_00559.csv JANUS 559
## 4: Daughter_Plates_00559.csv JANUS 559
## 5: Daughter_Plates_00559.csv JANUS 559
## ---
## 18428: Daughter_Plates_00574.csv JANUS 574
## 18429: Daughter_Plates_00574.csv JANUS 574
## 18430: Daughter_Plates_00574.csv JANUS 574
## 18431: Daughter_Plates_00574.csv JANUS 574
## 18432: Daughter_Plates_00574.csv JANUS 574
## TestName TestDateTime OperationId
## 1: STEP3_Make_3_Daughter_Plates.pro 7/5/2016 12:04:25 PM 769
## 2: STEP3_Make_3_Daughter_Plates.pro 7/5/2016 12:04:25 PM 770
## 3: STEP3_Make_3_Daughter_Plates.pro 7/5/2016 12:04:25 PM 771
## 4: STEP3_Make_3_Daughter_Plates.pro 7/5/2016 12:04:25 PM 772
## 5: STEP3_Make_3_Daughter_Plates.pro 7/5/2016 12:04:25 PM 773
## ---
## 18428: STEP3_Make_3_Daughter_Plates.pro 7/11/2016 4:19:21 PM 13436
## 18429: STEP3_Make_3_Daughter_Plates.pro 7/11/2016 4:19:21 PM 13437
## 18430: STEP3_Make_3_Daughter_Plates.pro 7/11/2016 4:19:21 PM 13438
## 18431: STEP3_Make_3_Daughter_Plates.pro 7/11/2016 4:19:21 PM 13439
## 18432: STEP3_Make_3_Daughter_Plates.pro 7/11/2016 4:19:21 PM 13440
## OpTime ProcName Tip SampleId Daughter_barcode
## 1: 155 siRNA Dilution Buffer 1 SRC0001 HT00008
## 2: 155 siRNA Dilution Buffer 2 SRC0002 HT00008
## 3: 155 siRNA Dilution Buffer 3 SRC0003 HT00008
## 4: 155 siRNA Dilution Buffer 4 SRC0004 HT00008
## 5: 155 siRNA Dilution Buffer 5 SRC0005 HT00008
## ---
## 18428: 527 Dilute Master Plate 380 SRC1148 HT00034
## 18429: 527 Dilute Master Plate 381 SRC1149 HT00034
## 18430: 527 Dilute Master Plate 382 SRC1150 HT00034
## 18431: 527 Dilute Master Plate 383 SRC1151 HT00034
## 18432: 527 Dilute Master Plate 384 SRC1152 HT00034
## Daughter_well Mother_barcode Mother_well Volume ErrorText
## 1: A1 Reservior_001 A1 27.6 NA
## 2: B1 Reservior_001 B1 27.6 NA
## 3: C1 Reservior_001 C1 27.6 NA
## 4: D1 Reservior_001 D1 27.6 NA
## 5: E1 Reservior_001 E1 27.6 NA
## ---
## 18428: L24 HT00025 L24 2.4 NA
## 18429: M24 HT00025 M24 2.4 NA
## 18430: N24 HT00025 N24 2.4 NA
## 18431: O24 HT00025 O24 2.4 NA
## 18432: P24 HT00025 P24 2.4 NA
## Response Text
## 1: NA
## 2: NA
## 3: NA
## 4: NA
## 5: NA
## ---
## 18428: NA
## 18429: NA
## 18430: NA
## 18431: NA
## 18432: NA
dt_s
## .id UserName TestId
## 1: Image_Plates_00800.csv JANUS 800
## 2: Image_Plates_00800.csv JANUS 800
## 3: Image_Plates_00800.csv JANUS 800
## 4: Image_Plates_00800.csv JANUS 800
## 5: Image_Plates_00800.csv JANUS 800
## ---
## 20732: Image_Plates_00807.csv JANUS 807
## 20733: Image_Plates_00807.csv JANUS 807
## 20734: Image_Plates_00807.csv JANUS 807
## 20735: Image_Plates_00807.csv JANUS 807
## 20736: Image_Plates_00807.csv JANUS 807
## TestName TestDateTime
## 1: STEP4auto_siRNA_Stamping_3_Image_Plates.pro 11/30/2016 12:18:17 PM
## 2: STEP4auto_siRNA_Stamping_3_Image_Plates.pro 11/30/2016 12:18:17 PM
## 3: STEP4auto_siRNA_Stamping_3_Image_Plates.pro 11/30/2016 12:18:17 PM
## 4: STEP4auto_siRNA_Stamping_3_Image_Plates.pro 11/30/2016 12:18:17 PM
## 5: STEP4auto_siRNA_Stamping_3_Image_Plates.pro 11/30/2016 12:18:17 PM
## ---
## 20732: STEP4auto_siRNA_Stamping_3_Image_Plates.pro 11/30/2016 1:03:57 PM
## 20733: STEP4auto_siRNA_Stamping_3_Image_Plates.pro 11/30/2016 1:03:57 PM
## 20734: STEP4auto_siRNA_Stamping_3_Image_Plates.pro 11/30/2016 1:03:57 PM
## 20735: STEP4auto_siRNA_Stamping_3_Image_Plates.pro 11/30/2016 1:03:57 PM
## 20736: STEP4auto_siRNA_Stamping_3_Image_Plates.pro 11/30/2016 1:03:57 PM
## OperationId OpTime ProcName Tip SampleId Assay_barcode
## 1: 769 131 Replicate Plate MDT_1 1 SRC0001 HTIF00034
## 2: 770 131 Replicate Plate MDT_1 2 SRC0002 HTIF00034
## 3: 771 131 Replicate Plate MDT_1 3 SRC0003 HTIF00034
## 4: 772 131 Replicate Plate MDT_1 4 SRC0004 HTIF00034
## 5: 773 131 Replicate Plate MDT_1 5 SRC0005 HTIF00034
## ---
## 20732: 2684 155 Replicate Plate MDT_1 380 SRC1148 HTIF00054
## 20733: 2685 155 Replicate Plate MDT_1 381 SRC1149 HTIF00054
## 20734: 2686 155 Replicate Plate MDT_1 382 SRC1150 HTIF00054
## 20735: 2687 155 Replicate Plate MDT_1 383 SRC1151 HTIF00054
## 20736: 2688 155 Replicate Plate MDT_1 384 SRC1152 HTIF00054
## Assay_well Daughter_barcode Daughter_well Volume ErrorText
## 1: A1 HT00009 A1 2 NA
## 2: B1 HT00009 B1 2 NA
## 3: C1 HT00009 C1 2 NA
## 4: D1 HT00009 D1 2 NA
## 5: E1 HT00009 E1 2 NA
## ---
## 20732: L24 HT00033 L24 2 NA
## 20733: M24 HT00033 M24 2 NA
## 20734: N24 HT00033 N24 2 NA
## 20735: O24 HT00033 O24 2 NA
## 20736: P24 HT00033 P24 2 NA
## Response Text
## 1: NA
## 2: NA
## 3: NA
## 4: NA
## 5: NA
## ---
## 20732: NA
## 20733: NA
## 20734: NA
## 20735: NA
## 20736: NA
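With the columns renamed as above, each log shares a barcode/well pair with the next protocol step, so an assay well can be traced back to its original library well by successive joins. The sketch below shows the idea in base R on one transfer per step. The barcodes are taken from the output above, but the pairing of these particular plates and wells is made up.

```r
# Toy versions of the renamed logs, one transfer each
# (barcodes from the output above; the pairing of plates and wells is made up)
comp <- data.frame(Ambion_barcode = "CPF20390", Ambion_well = "A1",
                   Mother_barcode = "HT00003", Mother_well = "A1")
daug <- data.frame(Mother_barcode = "HT00003", Mother_well = "A1",
                   Daughter_barcode = "HT00009", Daughter_well = "A1")
spot <- data.frame(Daughter_barcode = "HT00009", Daughter_well = "A1",
                   Assay_barcode = "HTIF00034", Assay_well = "A1")

# Each merge uses the barcode/well pair that two consecutive steps share
lineage <- merge(spot, daug, by = c("Daughter_barcode", "Daughter_well"))
lineage <- merge(lineage, comp, by = c("Mother_barcode", "Mother_well"))

lineage[, c("Assay_barcode", "Assay_well", "Ambion_barcode", "Ambion_well")]
```

The consistent naming scheme is what makes these joins possible: the destination columns of one step carry the same names as the source columns of the next.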
Eliminate the Q1_001 string from some of the Ambion barcodes.
dt_c[, Ambion_barcode := str_replace(Ambion_barcode, ".*(CPF[0-9A-Z]{5})", "\\1")] # Keep only the core barcode: CPF followed by 5 digits or capital letters
## .id UserName TestId
## 1: Compression_00554.csv JANUS 554
## 2: Compression_00554.csv JANUS 554
## 3: Compression_00554.csv JANUS 554
## 4: Compression_00554.csv JANUS 554
## 5: Compression_00554.csv JANUS 554
## ---
## 2876: Compression_00570.csv JANUS 570
## 2877: Compression_00570.csv JANUS 570
## 2878: Compression_00570.csv JANUS 570
## 2879: Compression_00570.csv JANUS 570
## 2880: Compression_00570.csv JANUS 570
## TestName TestDateTime
## 1: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/5/2016 10:18:14 AM
## 2: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/5/2016 10:18:14 AM
## 3: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/5/2016 10:18:14 AM
## 4: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/5/2016 10:18:14 AM
## 5: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/5/2016 10:18:14 AM
## ---
## 2876: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/11/2016 3:36:13 PM
## 2877: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/11/2016 3:36:13 PM
## 2878: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/11/2016 3:36:13 PM
## 2879: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/11/2016 3:36:13 PM
## 2880: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/11/2016 3:36:13 PM
## OperationId OpTime ProcName Tip SampleId Mother_barcode
## 1: 961 215 Compress Plate MDT_1 1 SRC0001 HT00003
## 2: 962 215 Compress Plate MDT_1 2 SRC0002 HT00003
## 3: 963 215 Compress Plate MDT_1 3 SRC0003 HT00003
## 4: 964 215 Compress Plate MDT_1 4 SRC0004 HT00003
## 5: 965 215 Compress Plate MDT_1 5 SRC0005 HT00003
## ---
## 2876: 4508 578 Compress Plate MDT_1 92 SRC0380 HT00025
## 2877: 4509 578 Compress Plate MDT_1 93 SRC0381 HT00025
## 2878: 4510 578 Compress Plate MDT_1 94 SRC0382 HT00025
## 2879: 4511 578 Compress Plate MDT_1 95 SRC0383 HT00025
## 2880: 4512 578 Compress Plate MDT_1 96 SRC0384 HT00025
## Mother_well Ambion_barcode Ambion_well Volume ErrorText
## 1: A1 CPF20390 A1 45 NA
## 2: C1 CPF20390 B1 45 NA
## 3: E1 CPF20390 C1 45 NA
## 4: G1 CPF20390 D1 45 NA
## 5: I1 CPF20390 E1 45 NA
## ---
## 2876: H24 CPF203AL D12 45 NA
## 2877: J24 CPF203AL E12 45 NA
## 2878: L24 CPF203AL F12 45 NA
## 2879: N24 CPF203AL G12 45 NA
## 2880: P24 CPF203AL H12 45 NA
## Response Text
## 1: NA
## 2: NA
## 3: NA
## 4: NA
## 5: NA
## ---
## 2876: NA
## 2877: NA
## 2878: NA
## 2879: NA
## 2880: NA
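The barcode cleanup relies on a greedy prefix match followed by a capture group. The same behavior can be reproduced in base R with sub(), used here in place of str_replace() so that the sketch needs no extra package. The two barcodes are taken from the tables above.

```r
# Barcodes as recorded in the Janus log: some carry a prefix, some do not
barcodes <- c("Q1_001CPF20390", "CPF203AL")

# Greedy ".*" consumes everything before the captured core barcode:
# "CPF" followed by five digits or capital letters
clean <- sub(".*(CPF[0-9A-Z]{5})", "\\1", barcodes)
clean
```

Barcodes without a prefix pass through unchanged, so the replacement is safe to apply to the whole column.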
Assign a plate number to the Mother plates. For now this is a manual operation. The hEpigenetics library corresponds to plates 1-5, whereas the hNucEnvelope library corresponds to plates 6 to 8.
run_number <- data.table(Mother_barcode = c("HT00003",
"HT00004",
"HT00005",
"HT00006",
"HT00007",
"HT00023",
"HT00024",
"HT00025"),
Plate_Number = 1:8)
setkey(run_number, Mother_barcode)
setkey(dt_c, Mother_barcode)
dt_c <- dt_c[run_number]
dt_c
## .id UserName TestId
## 1: Compression_00554.csv JANUS 554
## 2: Compression_00554.csv JANUS 554
## 3: Compression_00554.csv JANUS 554
## 4: Compression_00554.csv JANUS 554
## 5: Compression_00554.csv JANUS 554
## ---
## 2876: Compression_00570.csv JANUS 570
## 2877: Compression_00570.csv JANUS 570
## 2878: Compression_00570.csv JANUS 570
## 2879: Compression_00570.csv JANUS 570
## 2880: Compression_00570.csv JANUS 570
## TestName TestDateTime
## 1: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/5/2016 10:18:14 AM
## 2: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/5/2016 10:18:14 AM
## 3: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/5/2016 10:18:14 AM
## 4: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/5/2016 10:18:14 AM
## 5: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/5/2016 10:18:14 AM
## ---
## 2876: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/11/2016 3:36:13 PM
## 2877: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/11/2016 3:36:13 PM
## 2878: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/11/2016 3:36:13 PM
## 2879: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/11/2016 3:36:13 PM
## 2880: STEP2_Compress_siRNAsource_96WP_to_384WP.pro 7/11/2016 3:36:13 PM
## OperationId OpTime ProcName Tip SampleId Mother_barcode
## 1: 961 215 Compress Plate MDT_1 1 SRC0001 HT00003
## 2: 962 215 Compress Plate MDT_1 2 SRC0002 HT00003
## 3: 963 215 Compress Plate MDT_1 3 SRC0003 HT00003
## 4: 964 215 Compress Plate MDT_1 4 SRC0004 HT00003
## 5: 965 215 Compress Plate MDT_1 5 SRC0005 HT00003
## ---
## 2876: 4508 578 Compress Plate MDT_1 92 SRC0380 HT00025
## 2877: 4509 578 Compress Plate MDT_1 93 SRC0381 HT00025
## 2878: 4510 578 Compress Plate MDT_1 94 SRC0382 HT00025
## 2879: 4511 578 Compress Plate MDT_1 95 SRC0383 HT00025
## 2880: 4512 578 Compress Plate MDT_1 96 SRC0384 HT00025
## Mother_well Ambion_barcode Ambion_well Volume ErrorText
## 1: A1 CPF20390 A1 45 NA
## 2: C1 CPF20390 B1 45 NA
## 3: E1 CPF20390 C1 45 NA
## 4: G1 CPF20390 D1 45 NA
## 5: I1 CPF20390 E1 45 NA
## ---
## 2876: H24 CPF203AL D12 45 NA
## 2877: J24 CPF203AL E12 45 NA
## 2878: L24 CPF203AL F12 45 NA
## 2879: N24 CPF203AL G12 45 NA
## 2880: P24 CPF203AL H12 45 NA
## Response Text Plate_Number
## 1: NA 1
## 2: NA 1
## 3: NA 1
## 4: NA 1
## 5: NA 1
## ---
## 2876: NA 8
## 2877: NA 8
## 2878: NA 8
## 2879: NA 8
## 2880: NA 8
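The Mother_well and Ambion_well columns printed above show how the Compression step interleaves four 96-well plates into one 384-well plate: A1 goes to A1 and B1 to C1 for the first source plate, whereas D12 goes to H24 and H12 to P24 for the last one. The sketch below is a hypothetical helper, not part of the script, that reproduces this mapping. The offsets for quadrants 1 and 4 agree with the log shown here. Those for quadrants 2 and 3 follow a common convention and are an assumption.

```r
# Map a 96-well position to its 384-well position for a given quadrant (1 to 4).
# Offsets follow the common Q1 = A1, Q2 = A2, Q3 = B1, Q4 = B2 convention;
# only Q1 and Q4 can be checked against the log shown above, Q2 and Q3 are assumptions.
compress_well <- function(well96, quadrant) {
  row96 <- match(substr(well96, 1, 1), LETTERS)  # row letter to row index
  col96 <- as.integer(substring(well96, 2))      # column number
  row_offset <- c(0, 0, 1, 1)[quadrant]
  col_offset <- c(0, 1, 0, 1)[quadrant]
  row384 <- 2 * row96 - 1 + row_offset
  col384 <- 2 * col96 - 1 + col_offset
  paste0(LETTERS[row384], col384)
}

compress_well(c("A1", "B1", "A10"), quadrant = 1)   # first source plate of a Mother plate
compress_well(c("D12", "H12", "H1"), quadrant = 4)  # last source plate of a Mother plate
```

The first call returns A1, C1 and A19, and the second H24, P24 and P2, matching the Mother wells recorded in the Janus logs for those source wells.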
Eliminate the water addition steps to the Daughter plates.
dt_d <- dt_d[!(Mother_barcode == "Reservior_001"),]
Eliminate the Daughter1_001 string from the Daughter barcodes that carry it.
dt_d[, Daughter_barcode := str_replace(Daughter_barcode, ".*(HT[0-9]{5})", "\\1")]
## .id UserName TestId
## 1: Daughter_Plates_00559.csv JANUS 559
## 2: Daughter_Plates_00559.csv JANUS 559
## 3: Daughter_Plates_00559.csv JANUS 559
## 4: Daughter_Plates_00559.csv JANUS 559
## 5: Daughter_Plates_00559.csv JANUS 559
## ---
## 9212: Daughter_Plates_00574.csv JANUS 574
## 9213: Daughter_Plates_00574.csv JANUS 574
## 9214: Daughter_Plates_00574.csv JANUS 574
## 9215: Daughter_Plates_00574.csv JANUS 574
## 9216: Daughter_Plates_00574.csv JANUS 574
## TestName TestDateTime OperationId
## 1: STEP3_Make_3_Daughter_Plates.pro 7/5/2016 12:04:25 PM 3841
## 2: STEP3_Make_3_Daughter_Plates.pro 7/5/2016 12:04:25 PM 3842
## 3: STEP3_Make_3_Daughter_Plates.pro 7/5/2016 12:04:25 PM 3843
## 4: STEP3_Make_3_Daughter_Plates.pro 7/5/2016 12:04:25 PM 3844
## 5: STEP3_Make_3_Daughter_Plates.pro 7/5/2016 12:04:25 PM 3845
## ---
## 9212: STEP3_Make_3_Daughter_Plates.pro 7/11/2016 4:19:21 PM 13436
## 9213: STEP3_Make_3_Daughter_Plates.pro 7/11/2016 4:19:21 PM 13437
## 9214: STEP3_Make_3_Daughter_Plates.pro 7/11/2016 4:19:21 PM 13438
## 9215: STEP3_Make_3_Daughter_Plates.pro 7/11/2016 4:19:21 PM 13439
## 9216: STEP3_Make_3_Daughter_Plates.pro 7/11/2016 4:19:21 PM 13440
## OpTime ProcName Tip SampleId Daughter_barcode
## 1: 293 Dilute Master Plate 1 SRC0001 HT00008
## 2: 293 Dilute Master Plate 2 SRC0002 HT00008
## 3: 293 Dilute Master Plate 3 SRC0003 HT00008
## 4: 293 Dilute Master Plate 4 SRC0004 HT00008
## 5: 293 Dilute Master Plate 5 SRC0005 HT00008
## ---
## 9212: 527 Dilute Master Plate 380 SRC1148 HT00034
## 9213: 527 Dilute Master Plate 381 SRC1149 HT00034
## 9214: 527 Dilute Master Plate 382 SRC1150 HT00034
## 9215: 527 Dilute Master Plate 383 SRC1151 HT00034
## 9216: 527 Dilute Master Plate 384 SRC1152 HT00034
## Daughter_well Mother_barcode Mother_well Volume ErrorText
## 1: A1 HT00003 A1 2.4 NA
## 2: B1 HT00003 B1 2.4 NA
## 3: C1 HT00003 C1 2.4 NA
## 4: D1 HT00003 D1 2.4 NA
## 5: E1 HT00003 E1 2.4 NA
## ---
## 9212: L24 HT00025 L24 2.4 NA
## 9213: M24 HT00025 M24 2.4 NA
## 9214: N24 HT00025 N24 2.4 NA
## 9215: O24 HT00025 O24 2.4 NA
## 9216: P24 HT00025 P24 2.4 NA
## Response Text
## 1: NA
## 2: NA
## 3: NA
## 4: NA
## 5: NA
## ---
## 9212: NA
## 9213: NA
## 9214: NA
## 9215: NA
## 9216: NA
Join the well annotations from the ThermoFisher table with the logs obtained from the Compression step (reformatting from 96- to 384-well format).
setkey(dt_gl_filter, Ambion_barcode, Ambion_well)
setkey(dt_c, Ambion_barcode, Ambion_well)
names1 <- quote(list(Plate_Number,
Mother_barcode,
Mother_well,
`Plate Name`,
Ambion_barcode,
Ambion_well,
Row,
Col,
`RefSeq Accession Number`,
`Gene Symbol`,
`Full Gene Name`,
`Gene ID`,
`siRNA ID`,
`Exon(s) Targeted`,
`Sense siRNA Sequence`,
`Antisense siRNA Sequence`,
Validated))
join1 <- dt_gl_filter[dt_c, eval(names1), nomatch = 0]
dt_plate_c <- join1[, .SD[1, .(Ambion_barcode, Mother_barcode)],
by = .(Ambion_barcode, Mother_barcode)]
dt_plate_c
## Ambion_barcode Mother_barcode Ambion_barcode Mother_barcode
## 1: CPF20390 HT00003 CPF20390 HT00003
## 2: CPF20391 HT00003 CPF20391 HT00003
## 3: CPF20392 HT00003 CPF20392 HT00003
## 4: CPF20393 HT00003 CPF20393 HT00003
## 5: CPF20394 HT00006 CPF20394 HT00006
## 6: CPF20395 HT00006 CPF20395 HT00006
## 7: CPF20396 HT00004 CPF20396 HT00004
## 8: CPF20397 HT00004 CPF20397 HT00004
## 9: CPF20398 HT00004 CPF20398 HT00004
## 10: CPF20399 HT00004 CPF20399 HT00004
## 11: CPF2039A HT00006 CPF2039A HT00006
## 12: CPF2039B HT00007 CPF2039B HT00007
## 13: CPF2039C HT00005 CPF2039C HT00005
## 14: CPF2039D HT00005 CPF2039D HT00005
## 15: CPF2039E HT00005 CPF2039E HT00005
## 16: CPF2039F HT00005 CPF2039F HT00005
## 17: CPF2039G HT00007 CPF2039G HT00007
## 18: CPF2039H HT00007 CPF2039H HT00007
## 19: CPF203AA HT00023 CPF203AA HT00023
## 20: CPF203AB HT00023 CPF203AB HT00023
## 21: CPF203AC HT00023 CPF203AC HT00023
## 22: CPF203AD HT00023 CPF203AD HT00023
## 23: CPF203AE HT00024 CPF203AE HT00024
## 24: CPF203AF HT00024 CPF203AF HT00024
## 25: CPF203AG HT00024 CPF203AG HT00024
## 26: CPF203AH HT00024 CPF203AH HT00024
## 27: CPF203AI HT00025 CPF203AI HT00025
## 28: CPF203AJ HT00025 CPF203AJ HT00025
## 29: CPF203AK HT00025 CPF203AK HT00025
## 30: CPF203AL HT00025 CPF203AL HT00025
## Ambion_barcode Mother_barcode Ambion_barcode Mother_barcode
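The table printed above lists each column twice because the grouping columns are also selected inside `.SD`. A minimal sketch of an alternative, assuming only a source-plate to mother-plate map is needed, is to call `unique()` on the two barcode columns. `toy_join` and `toy_plate_map` are hypothetical names and the rows are a toy stand-in for `join1`.

```r
library(data.table)

# Toy stand-in for join1 (hypothetical rows; barcodes copied from the output above)
toy_join <- data.table(
  Ambion_barcode = c("CPF20390", "CPF20390", "CPF20391", "CPF20394"),
  Mother_barcode = c("HT00003", "HT00003", "HT00003", "HT00006"),
  Mother_well    = c("A1", "A3", "A2", "A1")
)

# One row per (source plate, mother plate) pair, each column listed once
toy_plate_map <- unique(toy_join[, .(Ambion_barcode, Mother_barcode)])
toy_plate_map
```

This gives the same pairs as the grouped `.SD[1, ...]` call, without the repeated column names.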
join1
## Plate_Number Mother_barcode Mother_well
## 1: 1 HT00003 A1
## 2: 1 HT00003 A19
## 3: 1 HT00003 A21
## 4: 1 HT00003 A3
## 5: 1 HT00003 A5
## ---
## 2589: 8 HT00025 P2
## 2590: 8 HT00025 P4
## 2591: 8 HT00025 P6
## 2592: 8 HT00025 P8
## 2593: 8 HT00025 P10
## Plate Name Ambion_barcode Ambion_well Row Col
## 1: Hm Epigenetic siRNA Lib V1-A1-1 CPF20390 A1 A 1
## 2: Hm Epigenetic siRNA Lib V1-A1-1 CPF20390 A10 A 10
## 3: Hm Epigenetic siRNA Lib V1-A1-1 CPF20390 A11 A 11
## 4: Hm Epigenetic siRNA Lib V1-A1-1 CPF20390 A2 A 2
## 5: Hm Epigenetic siRNA Lib V1-A1-1 CPF20390 A3 A 3
## ---
## 2589: 0006151435-C4 CPF203AL H1 H 1
## 2590: 0006151435-C4 CPF203AL H2 H 2
## 2591: 0006151435-C4 CPF203AL H3 H 3
## 2592: 0006151435-C4 CPF203AL H4 H 4
## 2593: 0006151435-C4 CPF203AL H5 H 5
## RefSeq Accession Number Gene Symbol
## 1: NM_000383 AIRE
## 2: NM_001786 CDC2
## 3: NM_001798 CDK2
## 4: NM_004041 ARRB1
## 5: NM_000051 ATM
## ---
## 2589: NM_016002 SCCPDH
## 2590: NM_002883 RANGAP1
## 2591: NM_017866 TMEM70
## 2592: NM_004318 ASPH
## 2593: NM_181783 TMTC3
## Full Gene Name Gene ID
## 1: autoimmune regulator 326
## 2: cell division cycle 2, G1 to S and G2 to M 983
## 3: cyclin-dependent kinase 2 1017
## 4: arrestin, beta 1 408
## 5: ataxia telangiectasia mutated 472
## ---
## 2589: saccharopine dehydrogenase (putative) 51097
## 2590: Ran GTPase activating protein 1 5905
## 2591: transmembrane protein 70 54968
## 2592: aspartate beta-hydroxylase 444
## 2593: transmembrane and tetratricopeptide repeat containing 3 160418
## siRNA ID Exon(s) Targeted Sense siRNA Sequence
## 1: s1439 Not Determined GCAUGGACACGACUCUUGUtt
## 2: s464 Not Determined GGUUAUAUCUCAUCUUUGAtt
## 3: s204 Not Determined CGGAGCUUGUUAUCGCAAAtt
## 4: s1623 8,8 GGAGAUCUAUUACCAUGGAtt
## 5: s1710 35 GCUGUUACCUGUUUGAAAAtt
## ---
## 2589: s27421 4 GAUCUGGGAGUAAUAUAUAtt
## 2590: s11778 6 GAAUGUCACCGGAAAUCCAtt
## 2591: s29880 Not Determined CACUGUUAGUUAAUCCAGUtt
## 2592: s270 Not Determined GUUUGAUCUUGUUGACUAUtt
## 2593: s46199 Not Determined GAAAACGACUUCUAAGUUAtt
## Antisense siRNA Sequence Validated
## 1: ACAAGAGUCGUGUCCAUGCcg NA
## 2: UCAAAGAUGAGAUAUAACCtg NA
## 3: UUUGCGAUAACAAGCUCCGtc NA
## 4: UCCAUGGUAAUAGAUCUCCtt NA
## 5: UUUUCAAACAGGUAACAGCtg NA
## ---
## 2589: UAUAUAUUACUCCCAGAUCtg NA
## 2590: UGGAUUUCCGGUGACAUUCgg NA
## 2591: ACUGGAUUAACUAACAGUGat NA
## 2592: AUAGUCAACAAGAUCAAACca NA
## 2593: UAACUUAGAAGUCGUUUUCta NA
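The join above relies on the data.table idiom `X[Y, nomatch = 0]`, which keeps only the rows of `Y` whose key values are also found in `X` (an inner join). The sketch below illustrates this on toy tables; `toy_annot`, `toy_log`, and the gene labels are hypothetical and are not taken from the library files.

```r
library(data.table)

# Hypothetical annotation table: one row per annotated library well
toy_annot <- data.table(
  Ambion_barcode = c("CPF20390", "CPF20390", "CPF20391"),
  Ambion_well    = c("A1", "A2", "A1"),
  Gene           = c("gene_1", "gene_2", "gene_3")
)

# Hypothetical transfer log: source well -> mother well.
# The last row (column 12, left empty in the library) has no annotation.
toy_log <- data.table(
  Ambion_barcode = c("CPF20390", "CPF20390", "CPF20391", "CPF20391"),
  Ambion_well    = c("A1", "A2", "A1", "A12"),
  Mother_barcode = "HT00003",
  Mother_well    = c("A1", "A3", "A2", "A24")
)

setkey(toy_annot, Ambion_barcode, Ambion_well)
setkey(toy_log, Ambion_barcode, Ambion_well)

# Inner join: log rows without a matching annotation are dropped
toy_inner <- toy_annot[toy_log, nomatch = 0]
toy_inner
```

With `nomatch = 0`, transfers from unannotated wells drop out of the result instead of appearing as rows filled with `NA`.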
Join the resulting table (join1) with the logs from the stamping plates step (Generation of the diluted daughter plates).
setkey(join1, Mother_barcode, Mother_well)
setkey(dt_d, Mother_barcode, Mother_well)
names2 <- quote(list(Plate_Number,
Daughter_barcode,
Daughter_well,
Mother_barcode,
Mother_well,
`Plate Name`,
Ambion_barcode,
Ambion_well,
Row,
Col,
`RefSeq Accession Number`,
`Gene Symbol`,
`Full Gene Name`,
`Gene ID`,
`siRNA ID`,
`Exon(s) Targeted`,
`Sense siRNA Sequence`,
`Antisense siRNA Sequence`,
Validated))
join2 <- dt_d[join1, eval(names2), nomatch = 0]
join2
## Plate_Number Daughter_barcode Daughter_well Mother_barcode
## 1: 1 HT00008 A1 HT00003
## 2: 1 HT00009 A1 HT00003
## 3: 1 HT00010 A1 HT00003
## 4: 1 HT00008 A10 HT00003
## 5: 1 HT00009 A10 HT00003
## ---
## 7775: 8 HT00033 P8 HT00025
## 7776: 8 HT00034 P8 HT00025
## 7777: 8 HT00032 P9 HT00025
## 7778: 8 HT00033 P9 HT00025
## 7779: 8 HT00034 P9 HT00025
## Mother_well Plate Name Ambion_barcode
## 1: A1 Hm Epigenetic siRNA Lib V1-A1-1 CPF20390
## 2: A1 Hm Epigenetic siRNA Lib V1-A1-1 CPF20390
## 3: A1 Hm Epigenetic siRNA Lib V1-A1-1 CPF20390
## 4: A10 Hm Epigenetic siRNA Lib V1-A1-2 CPF20391
## 5: A10 Hm Epigenetic siRNA Lib V1-A1-2 CPF20391
## ---
## 7775: P8 0006151435-C4 CPF203AL
## 7776: P8 0006151435-C4 CPF203AL
## 7777: P9 0006151435-C3 CPF203AK
## 7778: P9 0006151435-C3 CPF203AK
## 7779: P9 0006151435-C3 CPF203AK
## Ambion_well Row Col RefSeq Accession Number Gene Symbol
## 1: A1 A 1 NM_000383 AIRE
## 2: A1 A 1 NM_000383 AIRE
## 3: A1 A 1 NM_000383 AIRE
## 4: A5 A 5 NM_004187 JARID1C
## 5: A5 A 5 NM_004187 JARID1C
## ---
## 7775: H4 H 4 NM_004318 ASPH
## 7776: H4 H 4 NM_004318 ASPH
## 7777: H5 H 5 NM_002662 PLD1
## 7778: H5 H 5 NM_002662 PLD1
## 7779: H5 H 5 NM_002662 PLD1
## Full Gene Name Gene ID siRNA ID
## 1: autoimmune regulator 326 s1439
## 2: autoimmune regulator 326 s1439
## 3: autoimmune regulator 326 s1439
## 4: jumonji, AT rich interactive domain 1C 8242 s15748
## 5: jumonji, AT rich interactive domain 1C 8242 s15748
## ---
## 7775: aspartate beta-hydroxylase 444 s270
## 7776: aspartate beta-hydroxylase 444 s270
## 7777: phospholipase D1, phosphatidylcholine-specific 5337 s10639
## 7778: phospholipase D1, phosphatidylcholine-specific 5337 s10639
## 7779: phospholipase D1, phosphatidylcholine-specific 5337 s10639
## Exon(s) Targeted Sense siRNA Sequence Antisense siRNA Sequence
## 1: Not Determined GCAUGGACACGACUCUUGUtt ACAAGAGUCGUGUCCAUGCcg
## 2: Not Determined GCAUGGACACGACUCUUGUtt ACAAGAGUCGUGUCCAUGCcg
## 3: Not Determined GCAUGGACACGACUCUUGUtt ACAAGAGUCGUGUCCAUGCcg
## 4: Not Determined CAGACGAGAGUGAAACUGAtt UCAGUUUCACUCUCGUCUGgg
## 5: Not Determined CAGACGAGAGUGAAACUGAtt UCAGUUUCACUCUCGUCUGgg
## ---
## 7775: Not Determined GUUUGAUCUUGUUGACUAUtt AUAGUCAACAAGAUCAAACca
## 7776: Not Determined GUUUGAUCUUGUUGACUAUtt AUAGUCAACAAGAUCAAACca
## 7777: Not Determined GGCACUAUAUCUAUAUCGAtt UCGAUAUAGAUAUAGUGCCtg
## 7778: Not Determined GGCACUAUAUCUAUAUCGAtt UCGAUAUAGAUAUAGUGCCtg
## 7779: Not Determined GGCACUAUAUCUAUAUCGAtt UCGAUAUAGAUAUAGUGCCtg
## Validated
## 1: NA
## 2: NA
## 3: NA
## 4: NA
## 5: NA
## ---
## 7775: NA
## 7776: NA
## 7777: NA
## 7778: NA
## 7779: NA
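Because every mother plate was stamped into 3 daughter plates, each row of `join1` should appear 3 times in `join2`, which matches the row counts printed above (2593 and 7779). The sketch below shows this fan-out on a toy example and repeats the arithmetic; the `toy_*` tables are hypothetical.

```r
library(data.table)

# Two annotated wells on one mother plate
toy_mother <- data.table(Mother_barcode = "HT00003",
                         Mother_well    = c("A1", "A2"))

# Stamping log: every mother well goes to the same well of 3 daughter plates
toy_stamp <- CJ(Daughter_barcode = c("HT00008", "HT00009", "HT00010"),
                Mother_well      = c("A1", "A2"))
toy_stamp[, Mother_barcode := "HT00003"]
toy_stamp[, Daughter_well := Mother_well]

setkey(toy_mother, Mother_barcode, Mother_well)
setkey(toy_stamp, Mother_barcode, Mother_well)

toy_fanout <- toy_stamp[toy_mother, nomatch = 0]

# Each mother well yields exactly 3 daughter wells
stopifnot(nrow(toy_fanout) == 3 * nrow(toy_mother))

# Same arithmetic for the row counts reported above
stopifnot(2593 * 3 == 7779)
```

A check of this kind after each join is a cheap way to notice missing or duplicated log entries.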
Join the resulting table (join2) with the logs from the spotting plates step (Generation of the assay-ready imaging plates).
setkey(join2, Daughter_barcode, Daughter_well)
setkey(dt_s, Daughter_barcode, Daughter_well)
names3 <- quote(list(Plate_Number,
Assay_barcode, Assay_well,
Daughter_barcode,
Daughter_well,
Mother_barcode,
Mother_well,
`Plate Name`,
Ambion_barcode,
Ambion_well,
Row,
Col,
`RefSeq Accession Number`,
`Gene Symbol`,
`Full Gene Name`,
`Gene ID`,
`siRNA ID`,
`Exon(s) Targeted`,
`Sense siRNA Sequence`,
`Antisense siRNA Sequence`,
Validated))
join3 <- dt_s[join2, eval(names3), nomatch = 0]
join3
## Plate_Number Assay_barcode Assay_well Daughter_barcode
## 1: 1 HTIF00034 A1 HT00009
## 2: 1 HTIF00035 A1 HT00009
## 3: 1 HTIF00036 A1 HT00009
## 4: 1 HTIF00034 A1 HT00009
## 5: 1 HTIF00035 A1 HT00009
## ---
## 17630: 8 HTIF00056 P9 HT00033
## 17631: 8 HTIF00057 P9 HT00033
## 17632: 8 HTIF00052 P9 HT00033
## 17633: 8 HTIF00053 P9 HT00033
## 17634: 8 HTIF00054 P9 HT00033
## Daughter_well Mother_barcode Mother_well
## 1: A1 HT00003 A1
## 2: A1 HT00003 A1
## 3: A1 HT00003 A1
## 4: A1 HT00003 A1
## 5: A1 HT00003 A1
## ---
## 17630: P9 HT00025 P9
## 17631: P9 HT00025 P9
## 17632: P9 HT00025 P9
## 17633: P9 HT00025 P9
## 17634: P9 HT00025 P9
## Plate Name Ambion_barcode Ambion_well Row Col
## 1: Hm Epigenetic siRNA Lib V1-A1-1 CPF20390 A1 A 1
## 2: Hm Epigenetic siRNA Lib V1-A1-1 CPF20390 A1 A 1
## 3: Hm Epigenetic siRNA Lib V1-A1-1 CPF20390 A1 A 1
## 4: Hm Epigenetic siRNA Lib V1-A1-1 CPF20390 A1 A 1
## 5: Hm Epigenetic siRNA Lib V1-A1-1 CPF20390 A1 A 1
## ---
## 17630: 0006151435-C3 CPF203AK H5 H 5
## 17631: 0006151435-C3 CPF203AK H5 H 5
## 17632: 0006151435-C3 CPF203AK H5 H 5
## 17633: 0006151435-C3 CPF203AK H5 H 5
## 17634: 0006151435-C3 CPF203AK H5 H 5
## RefSeq Accession Number Gene Symbol
## 1: NM_000383 AIRE
## 2: NM_000383 AIRE
## 3: NM_000383 AIRE
## 4: NM_000383 AIRE
## 5: NM_000383 AIRE
## ---
## 17630: NM_002662 PLD1
## 17631: NM_002662 PLD1
## 17632: NM_002662 PLD1
## 17633: NM_002662 PLD1
## 17634: NM_002662 PLD1
## Full Gene Name Gene ID siRNA ID
## 1: autoimmune regulator 326 s1439
## 2: autoimmune regulator 326 s1439
## 3: autoimmune regulator 326 s1439
## 4: autoimmune regulator 326 s1439
## 5: autoimmune regulator 326 s1439
## ---
## 17630: phospholipase D1, phosphatidylcholine-specific 5337 s10639
## 17631: phospholipase D1, phosphatidylcholine-specific 5337 s10639
## 17632: phospholipase D1, phosphatidylcholine-specific 5337 s10639
## 17633: phospholipase D1, phosphatidylcholine-specific 5337 s10639
## 17634: phospholipase D1, phosphatidylcholine-specific 5337 s10639
## Exon(s) Targeted Sense siRNA Sequence Antisense siRNA Sequence
## 1: Not Determined GCAUGGACACGACUCUUGUtt ACAAGAGUCGUGUCCAUGCcg
## 2: Not Determined GCAUGGACACGACUCUUGUtt ACAAGAGUCGUGUCCAUGCcg
## 3: Not Determined GCAUGGACACGACUCUUGUtt ACAAGAGUCGUGUCCAUGCcg
## 4: Not Determined GCAUGGACACGACUCUUGUtt ACAAGAGUCGUGUCCAUGCcg
## 5: Not Determined GCAUGGACACGACUCUUGUtt ACAAGAGUCGUGUCCAUGCcg
## ---
## 17630: Not Determined GGCACUAUAUCUAUAUCGAtt UCGAUAUAGAUAUAGUGCCtg
## 17631: Not Determined GGCACUAUAUCUAUAUCGAtt UCGAUAUAGAUAUAGUGCCtg
## 17632: Not Determined GGCACUAUAUCUAUAUCGAtt UCGAUAUAGAUAUAGUGCCtg
## 17633: Not Determined GGCACUAUAUCUAUAUCGAtt UCGAUAUAGAUAUAGUGCCtg
## 17634: Not Determined GGCACUAUAUCUAUAUCGAtt UCGAUAUAGAUAUAGUGCCtg
## Validated
## 1: NA
## 2: NA
## 3: NA
## 4: NA
## 5: NA
## ---
## 17630: NA
## 17631: NA
## 17632: NA
## 17633: NA
## 17634: NA
Make the annotation file for cellHTS2, name it Annotation.txt and save it in the cellHTS2_input_Nuclei Selected - Nucleus Area [µm²] - Mean per Well directory.
dt_annotation <- join1[, list(Plate_Number, Mother_well, `Gene ID`, `Gene Symbol`, `siRNA ID`)]
# Rename variables according to cellHTS2 specifications
setnames(dt_annotation, c("Plate_Number", "Mother_well", "Gene ID", "Gene Symbol", "siRNA ID"),
c("Plate", "Well", "GeneID", "GeneSymbol", "siRNAID"))
# Change Well string according to cellHTS2 specifications
dt_annotation[, Well := sprintf("%s%02d",
str_extract(Well, "\\w"),
as.numeric(str_extract(Well, "\\d+")))][order(Plate, Well)]
## Plate Well GeneID GeneSymbol siRNAID
## 1: 1 A01 326 AIRE s1439
## 2: 1 A02 8019 BRD3 s15546
## 3: 1 A03 408 ARRB1 s1623
## 4: 1 A04 8085 MLL2 s15604
## 5: 1 A05 472 ATM s1710
## ---
## 2589: 8 P13 931 MS4A1 s2608
## 2590: 8 P15 144404 TMEM120B s44650
## 2591: 8 P17 10051 SMC4 s19540
## 2592: 8 P19 147991 DPY19L3 s45103
## 2593: 8 P21 11284 PNKP s22288
write.table(dt_annotation,
file = paste0(HTS2_input_dir, "/Annotation.txt"),
quote = FALSE,
sep = "\t",
na = "NA",
row.names = FALSE,
col.names = TRUE)
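The well renaming step above turns names such as A1 into the zero-padded form A01 that cellHTS2 expects. A minimal base R sketch of the same transformation follows; `pad_well` is a hypothetical helper and is not part of the original script.

```r
# Hypothetical helper: "A1" -> "A01", "P24" -> "P24"
pad_well <- function(well) {
  row_letter <- sub("^([A-Za-z]+).*$", "\\1", well)         # leading row letter(s)
  col_number <- as.integer(sub("^[A-Za-z]+", "", well))    # trailing column number
  sprintf("%s%02d", row_letter, col_number)
}

pad_well(c("A1", "B12", "P24"))
```

Zero padding also makes the well names sort in plate order when they are treated as character strings.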
Make the configuration file Plateconf.txt for cellHTS2 according to the documentation specifications:
The software expects this to be a rectangular table in a tabulator delimited text file, with mandatory columns Plate, Well, Content, plus two additional header lines that give the total number of wells and plates (see Table ?? for an example). The content of this file (except the two header lines) are stored in slot plateConf of x. As the name suggests, the Content column provides the content of each well in the plate (here referred to as the well annotation). Mainly, this annotation falls into four categories: empty wells, wells targeting genes of interest, control wells, and wells containing other things that do not fit in the previous categories. The first two types of wells should be indicated in the Content column of the plate configuration file by empty and sample, respectively, while the last type of wells should be indicated by other. The designation for the control wells in the Content column is more flexible. By default, the software expects them to be indicated by pos (for positive controls), or neg (for negative controls). However, other names are allowed, given that they are specified by the user whenever necessary (for example, when calling the writeReport function). This versatility for the control wells’ annotation is justified by the fact that, sometimes, multiple positive and/or negative controls can be employed in a given screen, making it useful to give different names to the distinct controls in the Content column. Moreover, this versatility might be required in multi-channel screens for which we frequently have reporter-specific controls. The Well column contains the name of each well of the plate in alphanumeric format (in this case, A01 to P24), while column Plate gives the plate number (1, 2, …). These two columns are also allowed to contain regular expressions. In the plate configuration file, each well and plate should be covered by a rule, and in case of multiple definitions only the last one is considered.
For example, in the file shown in Table ??, the rule specified by the first line after the column header indicates that all of the wells in each of the 57 assay plate contain “sample”. However, a following rule indicate that the content of wells A01, A02 and B01 and B02 differ from “sample”, containing other material (in this case, “other” and controls). Note that the well annotations mentioned above are used by the software in the normalization, quality control, and gene selection calculations. Data from wells that are annotated as empty are ignored, i. e. they are set to NA.
The configuration file tells cellHTS2 where the samples, controls and empty wells are on each plate. The first 8 lines of the file (headers and control positions) are hardcoded for now. The sample layout is appended from the processed information obtained from the Janus logs.
line1 <- "Wells: 384"
line2 <- "Plates: 8"
line3 <- "Plate\tWell\tContent"
line4 <- "*\t*\tempty"
odd_rows <- paste(LETTERS[seq(1, 16, by = 2)], collapse = ",")
even_rows <- paste(LETTERS[seq(2, 16, by = 2)], collapse = ",")
line5 <- paste0("*\t[", odd_rows, "]23\tneg") # siNT in Column 23 odd rows (A, C, ..., O)
line6 <- paste0("*\t[", even_rows, "]23\tsiKiller") # allSTAR killer in Column 23 even rows (B, D, ..., P)
line7 <- paste0("*\t[", odd_rows, "]24\tsiLMNB") # siLMNB in Column 24 odd rows (A, C, ..., O)
line8 <- paste0("*\t[", even_rows, "]24\tpos") # siSYNE2 in Column 24 even rows (B, D, ..., P)
header <- c(line1, line2, line3, line4, line5, line6, line7, line8)
header
## [1] "Wells: 384" "Plates: 8"
## [3] "Plate\tWell\tContent" "*\t*\tempty"
## [5] "*\t[A,C,E,G,I,K,M,O]23\tneg" "*\t[B,D,F,H,J,L,N,P]23\tsiKiller"
## [7] "*\t[A,C,E,G,I,K,M,O]24\tsiLMNB" "*\t[B,D,F,H,J,L,N,P]24\tpos"
# Write header to file
conf <- file(paste0(HTS2_input_dir, "/Plateconf.txt"), "w")
writeLines(header, conf)
close(conf)
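To see which wells the control rules in the header select, the rules can be applied to the 384 well names with ordinary regular expressions. This is only a sketch of the "last matching rule wins" behaviour described in the documentation quoted above, not cellHTS2's own implementation; translating the `*` wildcard into `.*` is an assumption, and `all_wells`, `rules`, and `content` are hypothetical names.

```r
# All 384 well names in cellHTS2 format (A01 ... P24)
all_wells <- sprintf("%s%02d",
                     rep(LETTERS[1:16], each = 24),
                     rep(1:24, times = 16))

# Rules in file order; "*" is written here as ".*" (assumption)
rules <- data.frame(
  pattern = c(".*",
              "[A,C,E,G,I,K,M,O]23", "[B,D,F,H,J,L,N,P]23",
              "[A,C,E,G,I,K,M,O]24", "[B,D,F,H,J,L,N,P]24"),
  content = c("empty", "neg", "siKiller", "siLMNB", "pos"),
  stringsAsFactors = FALSE
)

# Apply the rules in order so that later rules override earlier ones
content <- rep(NA_character_, length(all_wells))
for (i in seq_len(nrow(rules))) {
  hit <- grepl(paste0("^", rules$pattern[i], "$"), all_wells)
  content[hit] <- rules$content[i]
}

table(content)
```

The commas inside the square brackets are treated as literal characters of the character class; they are harmless here because no well name contains a comma.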
The sample layout is appended from the processed information obtained from the Janus logs and is written out to the cellHTS2_input_Nuclei Selected - Nucleus Area [µm²] - Mean per Well directory.
dt_config <- dt_annotation[, .(Plate = Plate,
Well = Well,
Content = "sample")]
write.table(dt_config,
paste0(HTS2_input_dir, "/Plateconf.txt"),
append = T, # append
quote = F,
col.names = F,
row.names = F,
sep = "\t")
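The two-step write (header lines first, sample rows appended afterwards) can be tried on a temporary file to confirm the resulting layout. The sketch below uses a shortened header and two hypothetical sample rows; `toy_file`, `toy_header`, and `toy_samples` are illustrative names only.

```r
toy_file <- tempfile(fileext = ".txt")

# Step 1: the two count lines, the column header, and a default rule
toy_header <- c("Wells: 384", "Plates: 8", "Plate\tWell\tContent", "*\t*\tempty")
writeLines(toy_header, toy_file)

# Step 2: append the sample rows without repeating the column names
toy_samples <- data.frame(Plate = c(1, 1),
                          Well = c("A01", "A02"),
                          Content = "sample")
write.table(toy_samples, toy_file,
            append = TRUE, quote = FALSE, sep = "\t",
            row.names = FALSE, col.names = FALSE)

# Read back: raw lines, and the rule table after skipping the two count lines
toy_lines <- readLines(toy_file)
toy_rules <- read.delim(toy_file, skip = 2, stringsAsFactors = FALSE)
toy_lines
```

Since sample rows come after the `empty` rule, they take precedence for the wells they name.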
Generate a Master_Barcode_List.txt file and save it in the working directory (i.e. the same directory where the .Rmd file is). According to the cellHTS2 specifications, the file should have these columns: PlateName (The plate barcode, as determined in the Janus log files, a string), Plate (The plate number in the library, a number), Replicate (Self explanatory, a number) and Batch (if the experiment or replicate was run in different batches, not necessary). This file contains the information on which plates have been analyzed, and on what the internal organization of the run is. The Master_Barcode_List.txt file is used to generate the measurement files to be read by cellHTS2.
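As an illustration of the layout described above, the sketch below builds a four-row table with the expected columns and writes it as a tab-delimited file. The barcodes, plate numbers, and replicate numbers mirror the first four entries of this screen's plate list; the file name and the `toy_master` object are hypothetical, and the file is written to a temporary directory so the real Master_Barcode_List.txt is not touched.

```r
library(data.table)

# One row per imaged assay plate
toy_master <- data.table(
  PlateName = c("HTIF00034", "HTIF00035", "HTIF00037", "HTIF00038"),  # assay plate barcode
  Plate     = c(1L, 1L, 2L, 2L),                                      # library plate number
  Replicate = c(1L, 2L, 1L, 2L),                                      # replicate number
  Batch     = 1L                                                      # batch number
)

toy_master_file <- file.path(tempdir(), "Master_Barcode_List_example.txt")
fwrite(toy_master, toy_master_file, sep = "\t")

# Read it back the same way the script reads the real file
toy_master_check <- fread(toy_master_file)
toy_master_check
```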
master_plate <- fread("Master_Barcode_List.txt")
# Join the master plate information to the Columbus analysis results based on the plate barcode
setkey(master_plate, PlateName)
setkey(dt_col, PlateName)
dt_col <- dt_col[master_plate, nomatch = 0]
dt_col
## .id
## 1: 161212_Epi_Env_Screen_v3[103387].result.1.txt
## 2: 161212_Epi_Env_Screen_v3[103387].result.1.txt
## 3: 161212_Epi_Env_Screen_v3[103387].result.1.txt
## 4: 161212_Epi_Env_Screen_v3[103387].result.1.txt
## 5: 161212_Epi_Env_Screen_v3[103387].result.1.txt
## ---
## 6139: 161212_Epi_Env_Screen_v3[103438].result.1.txt
## 6140: 161212_Epi_Env_Screen_v3[103438].result.1.txt
## 6141: 161212_Epi_Env_Screen_v3[103438].result.1.txt
## 6142: 161212_Epi_Env_Screen_v3[103438].result.1.txt
## 6143: 161212_Epi_Env_Screen_v3[103438].result.1.txt
## ScreenName ScreenID PlateName PlateID
## 1: 161212_Epi_Env_Screen_Batch1 2146 HTIF00034 2667
## 2: 161212_Epi_Env_Screen_Batch1 2146 HTIF00034 2667
## 3: 161212_Epi_Env_Screen_Batch1 2146 HTIF00034 2667
## 4: 161212_Epi_Env_Screen_Batch1 2146 HTIF00034 2667
## 5: 161212_Epi_Env_Screen_Batch1 2146 HTIF00034 2667
## ---
## 6139: 161219_Epi_Env_Screen_Batch2 2152 HTIF00059 2704
## 6140: 161219_Epi_Env_Screen_Batch2 2152 HTIF00059 2704
## 6141: 161219_Epi_Env_Screen_Batch2 2152 HTIF00059 2704
## 6142: 161219_Epi_Env_Screen_Batch2 2152 HTIF00059 2704
## 6143: 161219_Epi_Env_Screen_Batch2 2152 HTIF00059 2704
## MeasurementDate MeasurementID WellName Row Column Timepoint
## 1: 2016-12-13T02:51:57Z 2594 A1 1 1 1
## 2: 2016-12-13T02:51:57Z 2594 A2 1 2 1
## 3: 2016-12-13T02:51:57Z 2594 A3 1 3 1
## 4: 2016-12-13T02:51:57Z 2594 A4 1 4 1
## 5: 2016-12-13T02:51:57Z 2594 A5 1 5 1
## ---
## 6139: 2016-12-20T01:33:52Z 2629 P20 16 20 1
## 6140: 2016-12-20T01:33:52Z 2629 P21 16 21 1
## 6141: 2016-12-20T01:33:52Z 2629 P22 16 22 1
## 6142: 2016-12-20T01:33:52Z 2629 P23 16 23 1
## 6143: 2016-12-20T01:33:52Z 2629 P24 16 24 1
## Plane Nuclei - Number of Objects
## 1: 1 1134
## 2: 1 1565
## 3: 1 699
## 4: 1 1211
## 5: 1 1522
## ---
## 6139: 1 1187
## 6140: 1 1235
## 6141: 1 1120
## 6142: 1 14
## 6143: 1 312
## Nuclei - Nuclei Selected - Mean per Well
## 1: 0.7813051
## 2: 0.8089457
## 3: 0.7839771
## 4: 0.8191577
## 5: 0.8134034
## ---
## 6139: 0.8112890
## 6140: 0.8275304
## 6141: 0.7758929
## 6142: 0.7142857
## 6143: 0.7147436
## Nuclei Selected - Number of Objects
## 1: 886
## 2: 1266
## 3: 548
## 4: 992
## 5: 1238
## ---
## 6139: 963
## 6140: 1022
## 6141: 869
## 6142: 10
## 6143: 223
## Nuclei Selected - Nucleus Area [µm²] - Mean per Well
## 1: 190.6056
## 2: 179.5904
## 3: 205.4739
## 4: 187.1641
## 5: 171.9464
## ---
## 6139: 189.2278
## 6140: 179.5924
## 6141: 189.7720
## 6142: 136.2944
## 6143: 249.7675
## Nuclei Selected - Nucleus Roundness - Mean per Well
## 1: 0.9611200
## 2: 0.9686175
## 3: 0.9648931
## 4: 0.9644785
## 5: 0.9676596
## ---
## 6139: 0.9620000
## 6140: 0.9684931
## 6141: 0.9665236
## 6142: 0.8796078
## 6143: 0.9689305
## Nuclei Selected - Nucleus Width [µm] - Mean per Well
## 1: 12.507684
## 2: 12.415063
## 3: 13.170453
## 4: 12.498291
## 5: 12.152409
## ---
## 6139: 12.484556
## 6140: 12.266421
## 6141: 12.571911
## 6142: 9.879657
## 6143: 14.440401
## Nuclei Selected - Nucleus Length [µm] - Mean per Well
## 1: 17.66686
## 2: 16.94635
## 3: 18.41283
## 4: 17.43507
## 5: 16.44493
## ---
## 6139: 17.58874
## 6140: 17.02906
## 6141: 17.61077
## 6142: 15.78144
## 6143: 20.41251
## Nuclei Selected - Nucleus Ratio Width to Length - Mean per Well
## 1: 0.7127882
## 2: 0.7367895
## 3: 0.7215694
## 4: 0.7196473
## 5: 0.7410293
## ---
## 6139: 0.7140425
## 6140: 0.7251587
## 6141: 0.7168348
## 6142: 0.6336706
## 6143: 0.7146709
## Nuclei Selected - Intensity Nucleus Green Mean - Mean per Well
## 1: 330.8730
## 2: 259.4851
## 3: 365.1034
## 4: 298.1092
## 5: 262.2875
## ---
## 6139: 205.9935
## 6140: 217.6720
## 6141: 215.4974
## 6142: 465.2581
## 6143: 424.4270
## Nuclei Selected - Intensity Nucleus Red Mean - Mean per Well
## 1: 376.3647
## 2: 290.4706
## 3: 473.5955
## 4: 358.3449
## 5: 299.1064
## ---
## 6139: 317.2367
## 6140: 330.6010
## 6141: 348.8535
## 6142: 684.7642
## 6143: 813.8457
## Number of Analyzed Fields
## 1: 30
## 2: 30
## 3: 30
## 4: 30
## 5: 30
## ---
## 6139: 30
## 6140: 30
## 6141: 30
## 6142: 30
## 6143: 30
## Link Plate
## 1: http://columbus.nci.nih.gov/browse/measurement/2594/well=1.1 1
## 2: http://columbus.nci.nih.gov/browse/measurement/2594/well=1.2 1
## 3: http://columbus.nci.nih.gov/browse/measurement/2594/well=1.3 1
## 4: http://columbus.nci.nih.gov/browse/measurement/2594/well=1.4 1
## 5: http://columbus.nci.nih.gov/browse/measurement/2594/well=1.5 1
## ---
## 6139: http://columbus.nci.nih.gov/browse/measurement/2629/well=16.20 5
## 6140: http://columbus.nci.nih.gov/browse/measurement/2629/well=16.21 5
## 6141: http://columbus.nci.nih.gov/browse/measurement/2629/well=16.22 5
## 6142: http://columbus.nci.nih.gov/browse/measurement/2629/well=16.23 5
## 6143: http://columbus.nci.nih.gov/browse/measurement/2629/well=16.24 5
## Replicate Batch
## 1: 1 1
## 2: 1 1
## 3: 1 1
## 4: 1 1
## 5: 1 1
## ---
## 6139: 2 1
## 6140: 2 1
## 6141: 2 1
## 6142: 2 1
## 6143: 2 1
# Reformat the well names according to cellHTS2 specifications
dt_col[, WellName := sprintf("%s%02d",
str_extract(WellName, "\\w"),
as.numeric(str_extract(WellName, "\\d+")))]
dt_col
## .id
## 1: 161212_Epi_Env_Screen_v3[103387].result.1.txt
## 2: 161212_Epi_Env_Screen_v3[103387].result.1.txt
## 3: 161212_Epi_Env_Screen_v3[103387].result.1.txt
## 4: 161212_Epi_Env_Screen_v3[103387].result.1.txt
## 5: 161212_Epi_Env_Screen_v3[103387].result.1.txt
## ---
## 6139: 161212_Epi_Env_Screen_v3[103438].result.1.txt
## 6140: 161212_Epi_Env_Screen_v3[103438].result.1.txt
## 6141: 161212_Epi_Env_Screen_v3[103438].result.1.txt
## 6142: 161212_Epi_Env_Screen_v3[103438].result.1.txt
## 6143: 161212_Epi_Env_Screen_v3[103438].result.1.txt
## ScreenName ScreenID PlateName PlateID
## 1: 161212_Epi_Env_Screen_Batch1 2146 HTIF00034 2667
## 2: 161212_Epi_Env_Screen_Batch1 2146 HTIF00034 2667
## 3: 161212_Epi_Env_Screen_Batch1 2146 HTIF00034 2667
## 4: 161212_Epi_Env_Screen_Batch1 2146 HTIF00034 2667
## 5: 161212_Epi_Env_Screen_Batch1 2146 HTIF00034 2667
## ---
## 6139: 161219_Epi_Env_Screen_Batch2 2152 HTIF00059 2704
## 6140: 161219_Epi_Env_Screen_Batch2 2152 HTIF00059 2704
## 6141: 161219_Epi_Env_Screen_Batch2 2152 HTIF00059 2704
## 6142: 161219_Epi_Env_Screen_Batch2 2152 HTIF00059 2704
## 6143: 161219_Epi_Env_Screen_Batch2 2152 HTIF00059 2704
## MeasurementDate MeasurementID WellName Row Column Timepoint
## 1: 2016-12-13T02:51:57Z 2594 A01 1 1 1
## 2: 2016-12-13T02:51:57Z 2594 A02 1 2 1
## 3: 2016-12-13T02:51:57Z 2594 A03 1 3 1
## 4: 2016-12-13T02:51:57Z 2594 A04 1 4 1
## 5: 2016-12-13T02:51:57Z 2594 A05 1 5 1
## ---
## 6139: 2016-12-20T01:33:52Z 2629 P20 16 20 1
## 6140: 2016-12-20T01:33:52Z 2629 P21 16 21 1
## 6141: 2016-12-20T01:33:52Z 2629 P22 16 22 1
## 6142: 2016-12-20T01:33:52Z 2629 P23 16 23 1
## 6143: 2016-12-20T01:33:52Z 2629 P24 16 24 1
## Plane Nuclei - Number of Objects
## 1: 1 1134
## 2: 1 1565
## 3: 1 699
## 4: 1 1211
## 5: 1 1522
## ---
## 6139: 1 1187
## 6140: 1 1235
## 6141: 1 1120
## 6142: 1 14
## 6143: 1 312
## Nuclei - Nuclei Selected - Mean per Well
## 1: 0.7813051
## 2: 0.8089457
## 3: 0.7839771
## 4: 0.8191577
## 5: 0.8134034
## ---
## 6139: 0.8112890
## 6140: 0.8275304
## 6141: 0.7758929
## 6142: 0.7142857
## 6143: 0.7147436
## Nuclei Selected - Number of Objects
## 1: 886
## 2: 1266
## 3: 548
## 4: 992
## 5: 1238
## ---
## 6139: 963
## 6140: 1022
## 6141: 869
## 6142: 10
## 6143: 223
## Nuclei Selected - Nucleus Area [µm²] - Mean per Well
## 1: 190.6056
## 2: 179.5904
## 3: 205.4739
## 4: 187.1641
## 5: 171.9464
## ---
## 6139: 189.2278
## 6140: 179.5924
## 6141: 189.7720
## 6142: 136.2944
## 6143: 249.7675
## Nuclei Selected - Nucleus Roundness - Mean per Well
## 1: 0.9611200
## 2: 0.9686175
## 3: 0.9648931
## 4: 0.9644785
## 5: 0.9676596
## ---
## 6139: 0.9620000
## 6140: 0.9684931
## 6141: 0.9665236
## 6142: 0.8796078
## 6143: 0.9689305
## Nuclei Selected - Nucleus Width [µm] - Mean per Well
## 1: 12.507684
## 2: 12.415063
## 3: 13.170453
## 4: 12.498291
## 5: 12.152409
## ---
## 6139: 12.484556
## 6140: 12.266421
## 6141: 12.571911
## 6142: 9.879657
## 6143: 14.440401
## Nuclei Selected - Nucleus Length [µm] - Mean per Well
## 1: 17.66686
## 2: 16.94635
## 3: 18.41283
## 4: 17.43507
## 5: 16.44493
## ---
## 6139: 17.58874
## 6140: 17.02906
## 6141: 17.61077
## 6142: 15.78144
## 6143: 20.41251
## Nuclei Selected - Nucleus Ratio Width to Length - Mean per Well
## 1: 0.7127882
## 2: 0.7367895
## 3: 0.7215694
## 4: 0.7196473
## 5: 0.7410293
## ---
## 6139: 0.7140425
## 6140: 0.7251587
## 6141: 0.7168348
## 6142: 0.6336706
## 6143: 0.7146709
## Nuclei Selected - Intensity Nucleus Green Mean - Mean per Well
## 1: 330.8730
## 2: 259.4851
## 3: 365.1034
## 4: 298.1092
## 5: 262.2875
## ---
## 6139: 205.9935
## 6140: 217.6720
## 6141: 215.4974
## 6142: 465.2581
## 6143: 424.4270
## Nuclei Selected - Intensity Nucleus Red Mean - Mean per Well
## 1: 376.3647
## 2: 290.4706
## 3: 473.5955
## 4: 358.3449
## 5: 299.1064
## ---
## 6139: 317.2367
## 6140: 330.6010
## 6141: 348.8535
## 6142: 684.7642
## 6143: 813.8457
## Number of Analyzed Fields
## 1: 30
## 2: 30
## 3: 30
## 4: 30
## 5: 30
## ---
## 6139: 30
## 6140: 30
## 6141: 30
## 6142: 30
## 6143: 30
## Link Plate
## 1: http://columbus.nci.nih.gov/browse/measurement/2594/well=1.1 1
## 2: http://columbus.nci.nih.gov/browse/measurement/2594/well=1.2 1
## 3: http://columbus.nci.nih.gov/browse/measurement/2594/well=1.3 1
## 4: http://columbus.nci.nih.gov/browse/measurement/2594/well=1.4 1
## 5: http://columbus.nci.nih.gov/browse/measurement/2594/well=1.5 1
## ---
## 6139: http://columbus.nci.nih.gov/browse/measurement/2629/well=16.20 5
## 6140: http://columbus.nci.nih.gov/browse/measurement/2629/well=16.21 5
## 6141: http://columbus.nci.nih.gov/browse/measurement/2629/well=16.22 5
## 6142: http://columbus.nci.nih.gov/browse/measurement/2629/well=16.23 5
## 6143: http://columbus.nci.nih.gov/browse/measurement/2629/well=16.24 5
## Replicate Batch
## 1: 1 1
## 2: 1 1
## 3: 1 1
## 4: 1 1
## 5: 1 1
## ---
## 6139: 2 1
## 6140: 2 1
## 6141: 2 1
## 6142: 2 1
## 6143: 2 1
# Extract the variable to analyze in cellHTS2
dt_col_minimal <- dt_col[, .(PlateName = PlateName,
Plate = Plate,
Well = WellName,
Value = eval(meas_select))]
dt_col_minimal
## PlateName Plate Well Value
## 1: HTIF00034 1 A01 190.6056
## 2: HTIF00034 1 A02 179.5904
## 3: HTIF00034 1 A03 205.4739
## 4: HTIF00034 1 A04 187.1641
## 5: HTIF00034 1 A05 171.9464
## ---
## 6139: HTIF00059 5 P20 189.2278
## 6140: HTIF00059 5 P21 179.5924
## 6141: HTIF00059 5 P22 189.7720
## 6142: HTIF00059 5 P23 136.2944
## 6143: HTIF00059 5 P24 249.7675
Write the measurement files (one per plate) for cellHTS2 to the cellHTS2_input_Nuclei Selected - Nucleus Area [µm²] - Mean per Well directory.
dt_col_minimal[, write.table(.SD,
file = paste0(HTS2_input_dir,
"/",
unique(PlateName),
"_cellHTS2.txt"),
sep = "\t", row.names = F,
col.names = F,
quote = F,
na = "NaN"),
by = .(PlateName)]
## Empty data.table (0 rows) of 1 col: PlateName
Generate the cellHTS2 Platelist.txt file and write it to the cellHTS2_input_Nuclei Selected - Nucleus Area [µm²] - Mean per Well directory.
dt_platelist <- dt_col[, .(Filename = paste0(unique(PlateName),
"_cellHTS2.txt"),
Plate = unique(Plate),
Replicate = unique(Replicate),
Batch = unique(Batch)),
by = PlateName]
# Delete the PlateName column, as per the cellHTS2 Platelist.txt specs
dt_platelist[, PlateName := NULL]
## Filename Plate Replicate Batch
## 1: HTIF00034_cellHTS2.txt 1 1 1
## 2: HTIF00035_cellHTS2.txt 1 2 1
## 3: HTIF00037_cellHTS2.txt 2 1 1
## 4: HTIF00038_cellHTS2.txt 2 2 1
## 5: HTIF00040_cellHTS2.txt 3 1 1
## 6: HTIF00041_cellHTS2.txt 3 2 1
## 7: HTIF00043_cellHTS2.txt 4 1 1
## 8: HTIF00044_cellHTS2.txt 4 2 1
## 9: HTIF00046_cellHTS2.txt 6 1 1
## 10: HTIF00047_cellHTS2.txt 6 2 1
## 11: HTIF00049_cellHTS2.txt 7 1 1
## 12: HTIF00050_cellHTS2.txt 7 2 1
## 13: HTIF00052_cellHTS2.txt 8 1 1
## 14: HTIF00053_cellHTS2.txt 8 2 1
## 15: HTIF00058_cellHTS2.txt 5 1 1
## 16: HTIF00059_cellHTS2.txt 5 2 1
write.table(dt_platelist,
paste0(HTS2_input_dir, "/Platelist.txt"),
quote = F,
sep = "\t",
row.names = F,
col.names = T)
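As a rough illustration of the file layout written above, the following base R sketch builds a two-row plate list with the same four columns as the printed table, writes it tab-delimited to a temporary directory, and reads it back. The plate names (`TOY0001`, `TOY0002`) and the file name `Platelist_toy.txt` are hypothetical placeholders, not files from the screen.

```r
# Toy plate list with the same four columns as the table printed above.
# Plate names are hypothetical placeholders.
toy_platelist <- data.frame(
  Filename  = paste0(c("TOY0001", "TOY0002"), "_cellHTS2.txt"),
  Plate     = c(1L, 1L),
  Replicate = c(1L, 2L),
  Batch     = c(1L, 1L),
  stringsAsFactors = FALSE
)

# Write to a temporary directory rather than the real input directory.
toy_file <- file.path(tempdir(), "Platelist_toy.txt")
write.table(toy_platelist, toy_file,
            quote = FALSE, sep = "\t",
            row.names = FALSE, col.names = TRUE)

# Read the file back to check that the header and columns round-trip.
toy_check <- read.delim(toy_file, stringsAsFactors = FALSE)
print(toy_check)
```

Each row links one measurement file to its library plate, replicate, and batch, which is why the two replicates of plate 1 appear as separate rows.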
Read the measurement files from the cellHTS2_input_Nuclei Selected - Nucleus Area [µm²] - Mean per Well directory.
HTS2_Object <- readPlateList(filename = "Platelist.txt",
name = as.character(meas_select),
path = HTS2_input_dir)
Configure the cellHTS2 object. This operation adds the description of the experiment to the results and adds the gene annotations.
HTS2_Object <- configure(HTS2_Object,
descripFile = "Description.txt",
confFile = "Plateconf.txt",
path = HTS2_input_dir)
table(wellAnno(HTS2_Object))
##
## empty neg sikiller silmnb pos sample
## 223 64 64 64 64 2593
Plot the plate layout.
configurationAsScreenPlot(HTS2_Object)
Normalize the plates and score the replicates. In this particular case, use B-score normalization to reduce the impact of possible spatial artifacts on the plates, then convert the normalized values to Z-scores.
HTS2_Object_n <- normalizePlates(HTS2_Object,
scale = "additive",
log = FALSE,
method = "Bscore",
varianceAdjust = "byPlate")
HTS2_Object_sc <- scoreReplicates(HTS2_Object_n, sign = "+", method = "zscore")
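To give a feel for what the B-score normalization requested above does, here is a minimal base R sketch on a simulated plate. It assumes the usual description of the B-score (residuals of a two-way median polish, scaled by the plate MAD); it is not the cellHTS2 implementation, and the toy plate, the row gradient, and the simulated hit are all made up for illustration.

```r
# Toy 8 x 12 "plate" with an artificial row gradient plus one strong well.
set.seed(1)
toy_plate <- matrix(rnorm(96, mean = 200, sd = 5), nrow = 8, ncol = 12)
toy_plate <- toy_plate + (1:8) * 10        # additive row effect (spatial artifact)
toy_plate[2, 3] <- toy_plate[2, 3] + 100   # one simulated hit

# Two-way median polish estimates and removes additive row and column effects.
mp <- medpolish(toy_plate, trace.iter = FALSE, na.rm = TRUE)

# Scale the residuals by their median absolute deviation (per plate).
toy_bscore <- mp$residuals / mad(mp$residuals, na.rm = TRUE)

# The simulated hit should stand out, while the row gradient is gone.
round(toy_bscore[1:4, 1:6], 2)
```

Because the polish uses medians, a single strong well barely influences the estimated row and column effects, so the hit survives the correction while the gradient does not.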
Summarize the replicates. In this particular case, the final Z-score is the mean of the two replicate Z-scores.
HTS2_Object_sc <- summarizeReplicates(HTS2_Object_sc, summary = "mean")
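The scoring and summarization steps can be sketched on toy numbers as follows. This assumes that the Z-score is a robust one (centered on the median and scaled by the MAD), which is an assumption about the scoring method rather than something shown in this report; the values and the helper `robust_z` are hypothetical.

```r
# Toy normalized values for 6 wells in two replicates (made-up numbers).
rep1 <- c(-0.5, 0.2, 0.0, 0.4, -0.1, 6.0)
rep2 <- c(-0.3, 0.1, 0.2, 0.5, -0.2, 5.0)

# Robust z-score: centre on the median and scale by the MAD,
# so that a few strong hits do not inflate the spread estimate.
robust_z <- function(x) {
  (x - median(x, na.rm = TRUE)) / mad(x, na.rm = TRUE)
}

z1 <- robust_z(rep1)
z2 <- robust_z(rep2)

# Final per-well score: mean of the two replicate z-scores.
toy_score <- rowMeans(cbind(z1, z2), na.rm = TRUE)
round(toy_score, 2)
```

Averaging the replicate scores keeps a well high only when both replicates agree, which is the behaviour expected from `summary = "mean"`.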
Annotate the cellHTS2 object with gene names and siRNA IDs.
HTS2_Object_sc <- annotate(HTS2_Object_sc,
geneIDFile = "Annotation.txt",
path = HTS2_input_dir)
Save the cellHTS2 object as an .rda file in the appropriate results subfolder in case someone wants to inspect it and/or further process it.
save(HTS2_Object_sc, file = paste0(HTS2_output_dir, "/HTS2_Object_sc.rda"))
Output the screen results as an index.html file in the appropriate results subfolder.
setSettings(list(plateList = list(reproducibility = list(include = TRUE, map = TRUE),
intensities = list(include = TRUE, map = TRUE)),
screenSummary = list(scores = list(range = c(-4, 8), map = TRUE))))
cellHTS2_report <- writeReport(raw = HTS2_Object,
normalized = HTS2_Object_n,
scored = HTS2_Object_sc,
outdir = HTS2_output_dir,
force = TRUE,
mainScriptFile = "siSilencer_select_Ambion_cellHTS2_analysis.Rmd")
## cellHTS2 is busy creating HTML pages for 'Nuclei Selected - Nucleus Area [µm²] - Mean per Well'.
## Found raw, normalized and scored data.
## State:
## configured=TRUE, annotated=TRUE
##
Report was successfully generated in folder cellHTS2_output_Nuclei Selected - Nucleus Area [µm²] - Mean per Well/index.html
HTS2_results <- data.table(getTopTable(cellHTSlist = list("raw"=HTS2_Object,
"normalized"=HTS2_Object_n,
"scored"=HTS2_Object_sc),
file = paste0(HTS2_output_dir, "/Results_table.txt")))
HTS2_results_median <- HTS2_results[!(is.na(GeneID)),
.(median_score = median(score)),
by = .(GeneID,
GeneSymbol)][order(median_score, decreasing = F)]
write.table(HTS2_results_median,
paste0(HTS2_output_dir, "/Results_median.txt"),
col.names = T,
row.names = F,
quote = F,
sep = "\t")
HTS2_results[, UID := factor(paste(plate, well, sep = "_"))]
## plate position score well wellAnno finalWellAnno raw_r1_ch1
## 1: 3 19 18.83 A19 sample sample 686.1641
## 2: 1 19 14.22 A19 sample sample 409.0731
## 3: 1 102 12.67 E06 sample sample 424.7362
## 4: 2 19 11.66 A19 sample sample 387.3822
## 5: 7 196 7.64 I04 sample sample 318.4978
## ---
## 3068: 8 374 NaN P14 empty empty NA
## 3069: 8 376 NaN P16 empty empty NA
## 3070: 8 378 NaN P18 empty empty NA
## 3071: 8 380 NaN P20 empty empty NA
## 3072: 8 382 NaN P22 empty empty NA
## raw_r2_ch1 median_ch1 diff_ch1 raw/PlateMedian_r1_ch1
## 1: 356.7166 521.4403 -329.447485 3.57
## 2: 424.9112 416.9922 15.838136 2.15
## 3: 364.5262 394.6312 -60.210046 2.24
## 4: 425.9241 406.6531 38.541931 2.05
## 5: 313.6800 316.0889 -4.817844 1.69
## ---
## 3068: NA NA NA NA
## 3069: NA NA NA NA
## 3070: NA NA NA NA
## 3071: NA NA NA NA
## 3072: NA NA NA NA
## raw/PlateMedian_r2_ch1 normalized_r1_ch1 normalized_r2_ch1 GeneID
## 1: 1.78 29.191 8.471 983
## 2: 2.14 14.829 13.608 983
## 3: 1.83 15.849 9.499 9212
## 4: 2.14 11.618 11.697 983
## 5: 1.60 8.395 6.885 5685
## ---
## 3068: NA NA NA NA
## 3069: NA NA NA NA
## 3070: NA NA NA NA
## 3071: NA NA NA NA
## 3072: NA NA NA NA
## GeneSymbol siRNAID UID
## 1: CDC2 s463 3_A19
## 2: CDC2 s464 1_A19
## 3: AURKB s17612 1_E06
## 4: CDC2 s465 2_A19
## 5: PSMA4 s230030 7_I04
## ---
## 3068: NA NA 8_P14
## 3069: NA NA 8_P16
## 3070: NA NA 8_P18
## 3071: NA NA 8_P20
## 3072: NA NA 8_P22
HTS2_results[, UID := factor(UID, levels = UID[order(score)])]
ggplot(HTS2_results[!(wellAnno %in% c("sikiller", "empty")),],
aes(x = UID,
y = score,
color = wellAnno)) +
scale_y_continuous(lim = c(-7.5, 5), breaks = seq(-7.5, 5, 2.5)) +
geom_point(size = 1.5, alpha = 0.5) +
scale_color_solarized() +
scale_fill_solarized() +
theme_solarized(light = F)
ggplot(HTS2_results[!(wellAnno %in% c("sikiller", "empty")), ],
aes(
x = score,
y = ..density..,
color = wellAnno,
fill = wellAnno
)) +
geom_density(alpha = 0.3) +
scale_x_continuous(lim = c(-7.5, 5), breaks = seq(-7.5, 5, 2.5)) +
scale_color_solarized() +
scale_fill_solarized() +
theme_solarized(light = F)
ggplot(HTS2_results[!(wellAnno %in% c("sikiller", "empty")),],
aes(x = normalized_r1_ch1,
y = normalized_r2_ch1,
color = wellAnno)) +
scale_x_continuous(lim = c(-7.5, 5), breaks = seq(-7.5, 5, 2.5)) +
scale_y_continuous(lim = c(-7.5, 5), breaks = seq(-7.5, 5, 2.5)) +
geom_point(alpha = 0.3) +
geom_smooth(method = "lm")
samples <- HTS2_results[wellAnno == "sample",]
samples[, cor.test(normalized_r1_ch1, normalized_r2_ch1, method = "spearman")]
##
## Spearman's rank correlation rho
##
## data: normalized_r1_ch1 and normalized_r2_ch1
## S = 427540000, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## 0.8501064
Run the RSA analysis on the cellHTS2 object and save the results as Results_table_RSA.txt. Important note: the RSA analysis itself also includes the positive and negative controls, whereas only the sample wells are saved to file here. In addition, output a Results_table_RSA_filter.txt table that contains only the genes for which at least 2 siRNA oligos give a phenotype more than 1.5 MADs above the sample population median (Score > 1.5).
HTS2_rsa <- data.table(rsa(HTS2_Object_sc, reverse = T))
HTS2_rsa <- HTS2_rsa[!(is.na(GeneID)),]
HTS2_rsa[, Plate := as.integer(Plate)]
## GeneID Plate Well Score RSARank ScoreRank PValue RSAHit
## 1: 983 3 A19 18.83 1 1 1.038942e-09 1
## 2: 983 1 A19 14.22 2 2 1.038942e-09 1
## 3: 983 2 A19 11.66 3 4 1.038942e-09 1
## 4: 23165 8 D11 6.14 4 9 3.999927e-07 1
## 5: 23165 7 D11 5.15 5 18 3.999927e-07 1
## ---
## 2589: 6795 3 M13 0.64 1346 1346 1.075531e-01 0
## 2590: 27125 5 G15 0.66 1346 1346 1.049677e-02 0
## 2591: 84261 5 O07 0.67 1346 1346 8.403318e-03 0
## 2592: 81671 8 F02 0.82 1346 1346 1.675943e-02 0
## 2593: 55929 4 L13 0.84 1346 1346 1.201864e-01 0
## #HitWell #TotalWell %HitWell
## 1: 3 3 100
## 2: 3 3 100
## 3: 3 3 100
## 4: 3 3 100
## 5: 3 3 100
## ---
## 2589: 2 3 67
## 2590: 2 3 67
## 2591: 3 3 100
## 2592: 2 3 67
## 2593: 2 3 67
setkey(HTS2_rsa, Plate, Well)
setkey(dt_annotation, Plate, Well)
HTS2_rsa <- HTS2_rsa[dt_annotation, nomatch = 0][order(RSARank)]
write.table(HTS2_rsa,
paste0(HTS2_output_dir, "/Results_table_RSA.txt"),
col.names = T,
row.names = F,
quote = F,
sep = "\t")
HTS2_rsa[, Flag := Score > 1.5]
## GeneID Plate Well Score RSARank ScoreRank PValue RSAHit
## 1: 983 3 A19 18.83 1 1 1.038942e-09 1
## 2: 983 1 A19 14.22 2 2 1.038942e-09 1
## 3: 983 2 A19 11.66 3 4 1.038942e-09 1
## 4: 23165 8 D11 6.14 4 9 3.999927e-07 1
## 5: 23165 7 D11 5.15 5 18 3.999927e-07 1
## ---
## 2589: 54968 8 P06 -1.07 1346 1346 8.353398e-01 0
## 2590: 444 8 P08 -0.49 1346 1346 1.000000e+00 0
## 2591: 160418 8 P10 0.26 1346 1346 2.496345e-01 0
## 2592: 931 8 P13 -0.60 1346 1346 1.393887e-01 0
## 2593: 11284 8 P21 -0.11 1346 1346 1.889190e-01 0
## #HitWell #TotalWell %HitWell i.GeneID GeneSymbol siRNAID Flag
## 1: 3 3 100 983 CDC2 s463 TRUE
## 2: 3 3 100 983 CDC2 s464 TRUE
## 3: 3 3 100 983 CDC2 s465 TRUE
## 4: 3 3 100 23165 NUP205 s23177 TRUE
## 5: 3 3 100 23165 NUP205 s23176 TRUE
## ---
## 2589: 1 3 33 54968 TMEM70 s29880 FALSE
## 2590: 0 3 0 444 ASPH s270 FALSE
## 2591: 2 3 67 160418 TMTC3 s46199 FALSE
## 2592: 1 3 33 931 MS4A1 s2608 FALSE
## 2593: 1 3 33 11284 PNKP s22288 FALSE
HTS2_rsa_flag <- HTS2_rsa[, .(Flag2 = (sum(Flag) >= 2)), by = GeneID]
setkey(HTS2_rsa, GeneID)
setkey(HTS2_rsa_flag, GeneID)
HTS2_rsa <- HTS2_rsa[HTS2_rsa_flag, nomatch = 0][order(RSARank)]
HTS2_rsa_thresholded <- HTS2_rsa[Flag2 == TRUE,]
write.table(HTS2_rsa_thresholded,
paste0(HTS2_output_dir, "/Results_table_RSA_filter.txt"),
col.names = T,
row.names = F,
quote = F,
sep = "\t")
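The gene-level filter above (keep a gene only if at least 2 of its oligos exceed the score threshold) can be illustrated with a small base R sketch. The gene names and scores below are hypothetical; the threshold of 1.5 and the "at least 2 oligos" rule are taken from the code above.

```r
# Toy per-siRNA scores: three oligos for each of three hypothetical genes.
toy_rsa <- data.frame(
  GeneID = rep(c("geneA", "geneB", "geneC"), each = 3),
  Score  = c(2.1, 1.8, 0.3,    # geneA: 2 of 3 oligos above threshold
             1.9, 0.2, -0.4,   # geneB: 1 of 3
             3.0, 2.5, 1.6),   # geneC: 3 of 3
  stringsAsFactors = FALSE
)

# Flag individual oligos whose score exceeds the threshold.
toy_rsa$Flag <- toy_rsa$Score > 1.5

# Count flagged oligos per gene and keep genes with at least two.
n_flagged  <- tapply(toy_rsa$Flag, toy_rsa$GeneID, sum)
keep_genes <- names(n_flagged)[n_flagged >= 2]

toy_filtered <- toy_rsa[toy_rsa$GeneID %in% keep_genes, ]
toy_filtered
```

Requiring two concordant oligos per gene is a common guard against off-target effects of a single siRNA; here it drops `geneB`, which has only one oligo above the threshold.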
Document the analysis session information.
sessionInfo()
## R version 3.3.2 (2016-10-31)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: macOS Sierra 10.12.2
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] grid parallel stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] ggthemes_3.3.0 cellHTS2_2.36.0 locfit_1.5-9.1
## [4] hwriter_1.3.2 vsn_3.40.0 splots_1.38.0
## [7] genefilter_1.54.2 Biobase_2.32.0 BiocGenerics_0.18.0
## [10] RColorBrewer_1.1-2 data.table_1.10.0 knitr_1.15.1
## [13] stringr_1.1.0 ggplot2_2.2.0 plyr_1.8.4
##
## loaded via a namespace (and not attached):
## [1] splines_3.3.2 lattice_0.20-34 pcaPP_1.9-61
## [4] colorspace_1.3-2 htmltools_0.3.5 stats4_3.3.2
## [7] Category_2.38.0 yaml_2.1.14 RBGL_1.48.1
## [10] survival_2.40-1 XML_3.98-1.5 DBI_0.5-1
## [13] affy_1.50.0 affyio_1.42.0 robustbase_0.92-7
## [16] zlibbioc_1.18.0 munsell_0.4.3 gtable_0.2.0
## [19] mvtnorm_1.0-5 memoise_1.0.0 evaluate_0.10
## [22] labeling_0.3 IRanges_2.6.1 BiocInstaller_1.22.3
## [25] AnnotationDbi_1.34.4 preprocessCore_1.34.0 DEoptimR_1.0-8
## [28] GSEABase_1.34.1 Rcpp_0.12.8 xtable_1.8-2
## [31] scales_0.4.1 backports_1.0.4 limma_3.28.21
## [34] S4Vectors_0.10.3 graph_1.50.0 annotate_1.50.1
## [37] digest_0.6.10 stringi_1.1.2 rprojroot_1.1
## [40] tools_3.3.2 bitops_1.0-6 magrittr_1.5
## [43] lazyeval_0.2.0 RCurl_1.95-4.8 tibble_1.2
## [46] RSQLite_1.1-1 cluster_2.0.5 rrcov_1.4-3
## [49] MASS_7.3-45 Matrix_1.2-7.1 prada_1.48.0
## [52] assertthat_0.1 rmarkdown_1.3